Introduction


Theoretical physics is all about casting our concepts about the real world into rigorous
mathematical form, for better or worse. But theoretical physics does not do that for its
own sake. It does so in order to fully explore the implications of what our concepts about
the real world are. So, to a certain extent, the spirit of theoretical physics can be cast into
the words of Wittgenstein who said: “What we cannot speak about [clearly] we must pass
over in silence.” Indeed, if we have concepts about the real world and it is not possible to
cast them into rigorous mathematical form, that is usually an indicator that some aspects
of these concepts have not been well understood.

Theoretical physics aims at casting these concepts into mathematical language. But
then, mathematics is just that: it is just a language. If we want to extract physical
conclusions from this formulation, we must interpret the language. That is not the purpose
or task of mathematics; that is the task of physicists. That is where it gets difficult. But
then, again, mathematics is just a language and, going back to Wittgenstein, he said: “The
theorems of mathematics all say the same. Namely, nothing.” What did he mean by that?
Well, obviously, he did not mean that mathematics is useless. He just referred to the fact
that if we have a theorem of the type “$A$ if, and only if, $B$”, where $A$ and $B$ are
propositions, then obviously $B$ says nothing else than $A$ does, and $A$ says nothing else
than $B$ does. It is a tautology. However, while from the point of view of logic and
mathematics it is a tautology, psychologically, in terms of our understanding of $A$, it may
be very useful to have a reformulation of $A$ in terms of $B$.

Thus, with the understanding that mathematics just gives us a language for what we 
want to do, the idea of this course is to provide proper language for theoretical physics. 
In particular, we will provide the proper mathematical language for classical mechanics, 
electromagnetism, quantum mechanics and statistical physics. We are not going to revise
all the mathematics that is needed for these four subjects, but rather we will develop the
mathematics from a higher point of view, assuming some prior knowledge of these subjects.


1 Logic of propositions and predicates 


1.1 Propositional logic 


Definition. A proposition $p$ is a variable¹ that can take the values true ($T$) or false ($F$),
and no others.


This is what a proposition is from the point of view of propositional logic. In particular,
it is not the task of propositional logic to decide whether a complex statement of the form
“there is extraterrestrial life” is true or not. Propositional logic already deals with the
complete proposition, and it just assumes that it is either true or false. It is also not the task
of propositional logic to decide whether a statement of the type “in winter is colder than
outside” is a proposition or not (i.e. whether it has the property of being either true or
false). In this particular case, the statement looks rather meaningless.


Definition. A proposition which is always true is called a tautology, while one which is 
always false is called a contradiction. 


It is possible to build new propositions from given ones using logical operators. The 
simplest kind of logical operators are unary operators, which take in one proposition and 
return another proposition. There are four unary operators in total, and they differ by the 
truth value of the resulting proposition which, in general, depends on the truth value of p. 


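We can represent them in a table as follows:

 $p$ | $\neg p$ | $\mathrm{id}(p)$ | $\top p$ | $\bot p$
 F   | T        | F                | T        | F
 T   | F        | T                | T        | F

where $\neg$ is the negation operator, $\mathrm{id}$ is the identity operator, $\top$ is the
tautology operator and $\bot$ is the contradiction operator. These clearly exhaust all
possibilities for unary operators.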

The next step is to consider binary operators, i.e. operators that take in two propositions
and return a new proposition. There are four combinations of the truth values of
two propositions and, since a binary operator assigns one of the two possible truth values
to each of those, we have $2^4 = 16$ binary operators in total. The operators $\land$, $\lor$
and $\veebar$, called and, or and exclusive or respectively, should already be familiar to you.

 $p$ | $q$ | $p \land q$ | $p \lor q$ | $p \veebar q$
 F   | F   | F           | F          | F
 F   | T   | F           | T          | T
 T   | F   | F           | T          | T
 T   | T   | T           | T          | F
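The count of 16 can be checked by brute force. The following is a minimal sketch in Python; the names `INPUTS` and `all_binary_operators` are illustrative and not part of the text. Each binary operator is identified with its column of four truth values, listed in the row order F F, F T, T F, T T.

```python
from itertools import product

# The four possible input rows (p, q): (F, F), (F, T), (T, F), (T, T).
INPUTS = list(product([False, True], repeat=2))


def all_binary_operators():
    """Return every binary operator as a dict mapping (p, q) to a truth value."""
    return [
        dict(zip(INPUTS, column))
        for column in product([False, True], repeat=len(INPUTS))
    ]


operators = all_binary_operators()

# The three familiar operators, written as truth tables of the same shape.
AND = {(p, q): p and q for p, q in INPUTS}
OR = {(p, q): p or q for p, q in INPUTS}
XOR = {(p, q): p != q for p, q in INPUTS}
```

Every operator is just a choice of one truth value per row, which is why the total is two to the power four.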


There is one binary operator, the implication operator $\Rightarrow$, which is sometimes a
little ill understood, unless you are already very knowledgeable about these things. Its
usefulness comes in conjunction with the equivalence operator $\Leftrightarrow$. We have:

 $p$ | $q$ | $p \Rightarrow q$ | $p \Leftrightarrow q$
 F   | F   | T                 | T
 F   | T   | T                 | F
 T   | F   | F                 | F
 T   | T   | T                 | T

¹ By this we mean a formal expression, with no extra structure assumed.


While the fact that the proposition $p \Rightarrow q$ is true whenever $p$ is false may be
surprising at first, it is just the definition of the implication operator and it is an
expression of the principle “ex falso quodlibet”, that is, from a false assumption anything
follows. Of course, you may be wondering why on earth we would want to define the
implication operator in this way. The answer to this is hidden in the following result.


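Theorem 1.1. Let $p, q$ be propositions. Then $(p \Rightarrow q) \Leftrightarrow ((\neg q) \Rightarrow (\neg p))$.


Proof. We simply construct the truth tables for $p \Rightarrow q$ and $(\neg q) \Rightarrow (\neg p)$.

 $p$ | $q$ | $\neg p$ | $\neg q$ | $p \Rightarrow q$ | $(\neg q) \Rightarrow (\neg p)$
 F   | F   | T        | T        | T                 | T
 F   | T   | T        | F        | T                 | T
 T   | F   | F        | T        | F                 | F
 T   | T   | F        | F        | T                 | T

The columns for $p \Rightarrow q$ and $(\neg q) \Rightarrow (\neg p)$ are identical and hence we are done.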
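The truth-table argument can also be run mechanically. Here is a small Python sketch, where `implies` and `contrapositive_agrees` are names chosen for illustration: it encodes the implication operator by its truth table and compares both sides of the theorem on all four rows.

```python
from itertools import product


def implies(p, q):
    """Implication operator: false only when p is true and q is false."""
    return (not p) or q


def contrapositive_agrees():
    """Check that p => q and (not q) => (not p) coincide on every row."""
    return all(
        implies(p, q) == implies(not q, not p)
        for p, q in product([False, True], repeat=2)
    )
```

Calling `contrapositive_agrees()` visits exactly the four rows of the table in the proof.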

Remark 1.2. We agree on decreasing binding strength in the sequence:
\[ \neg, \ \land, \ \lor, \ \Rightarrow, \ \Leftrightarrow. \]

For example, $(\neg q) \Rightarrow (\neg p)$ may be written unambiguously as $\neg q \Rightarrow \neg p$.


Remark 1.3. All higher order operators $O(p_1, \ldots, p_N)$ can be constructed from a single
binary operator defined by:

 $p$ | $q$ | $p \uparrow q$
 F   | F   | T
 F   | T   | T
 T   | F   | T
 T   | T   | F

This is called the nand operator and, in fact, we have $(p \uparrow q) \Leftrightarrow \neg(p \land q)$.
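To make the claim about nand concrete for a few familiar operators, the sketch below rebuilds negation, and, or and implication using only `nand`. The trailing underscores in the function names are only there to avoid Python keywords; the particular constructions are standard identities, not taken from the text.

```python
def nand(p, q):
    """The nand operator: true unless both arguments are true."""
    return not (p and q)


def not_(p):
    # not p  <=>  p nand p
    return nand(p, p)


def and_(p, q):
    # p and q  <=>  not (p nand q)
    return nand(nand(p, q), nand(p, q))


def or_(p, q):
    # p or q  <=>  (not p) nand (not q)
    return nand(nand(p, p), nand(q, q))


def implies_(p, q):
    # p => q  <=>  p nand (not q)
    return nand(p, nand(q, q))
```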


1.2 Predicate logic 


Definition. A predicate is (informally) a proposition-valued function of some variable or
variables. In particular, a predicate of two variables is called a relation.


For example, $P(x)$ is a proposition for each choice of the variable $x$, and its truth value
depends on $x$. Similarly, the predicate $Q(x, y)$ is, for any choice of $x$ and $y$, a proposition
and its truth value depends on $x$ and $y$.

Just like for propositional logic, it is not the task of predicate logic to examine how
predicates are built from the variables on which they depend. In order to do that, one
would need some further language establishing the rules to combine the variables $x$ and
$y$ into a predicate. Also, you may want to specify from which “set” $x$ and $y$ come.
Instead, we leave it completely open, and simply consider $x$ and $y$ formal variables, with
no extra conditions imposed.

This may seem a bit weird since from elementary school one is conditioned to always
ask where “$x$” comes from upon seeing an expression like $P(x)$. However, it is crucial that
we refrain from doing this here, since we want to only later define the notion of set, using
the language of propositional and predicate logic. As with propositions, we can construct
new predicates from given ones by using the operators defined in the previous section. For
example, we might have:

\[ Q(x, y, z) :\Leftrightarrow P(x) \land R(y, z), \]

where the symbol $:\Leftrightarrow$ means “defined as being equivalent to”. More interestingly, we
can construct a new proposition from a given predicate by using quantifiers.


Definition. Let $P(x)$ be a predicate. Then:
\[ \forall x : P(x) \]
is a proposition, which we read as “for all $x$, $P$ of $x$ (is true)”, and it is defined to be
true if $P(x)$ is true independently of $x$, false otherwise. The symbol $\forall$ is called the
universal quantifier.


Definition. Let $P(x)$ be a predicate. Then we define:

\[ \exists x : P(x) :\Leftrightarrow \neg(\forall x : \neg P(x)). \]

The proposition $\exists x : P(x)$ is read as “there exists (at least one) $x$ such that $P$ of $x$
(is true)” and the symbol $\exists$ is called the existential quantifier.

The following result is an immediate consequence of these definitions.


Corollary 1.4. Let $P(x)$ be a predicate. Then:

\[ \forall x : P(x) \Leftrightarrow \neg(\exists x : \neg P(x)). \]

Remark 1.5. It is possible to define quantification of predicates of more than one variable.
In order to do so, one proceeds in steps quantifying a predicate of one variable at each step.


Example 1.6. Let $P(x, y)$ be a predicate. Then, for fixed $y$, $P(x, y)$ is a predicate of one
variable and we define:

\[ Q(y) :\Leftrightarrow \forall x : P(x, y). \]

Hence we may have the following:

\[ \exists y : \forall x : P(x, y) :\Leftrightarrow \exists y : Q(y). \]

Other combinations of quantifiers are defined analogously.


Remark 1.7. The order of quantification matters (if the quantifiers are not all the same).
For a given predicate $P(x, y)$, the propositions:

\[ \exists y : \forall x : P(x, y) \quad \text{and} \quad \forall x : \exists y : P(x, y) \]

are not necessarily equivalent.


Example 1.8. Consider the proposition expressing the existence of additive inverses in the
real numbers. We have:

\[ \forall x : \exists y : x + y = 0, \]

i.e. for each $x$ there exists an inverse $y$ such that $x + y = 0$. For 1 this is $-1$, for 2 it is
$-2$ etc. Consider now the proposition obtained by swapping the quantifiers in the previous
proposition:

\[ \exists y : \forall x : x + y = 0. \]

What this proposition is saying is that there exists a real number $y$ such that, no
matter what $x$ is, we have $x + y = 0$. This is clearly false, since if $x + y = 0$ for some $x$
then $(x + 1) + y \neq 0$, so the same $y$ cannot work for both $x$ and $x + 1$, let alone every $x$.
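The effect of swapping quantifiers can be reproduced on a finite domain, where `all` and `any` play the roles of the universal and existential quantifiers. The sketch below is only an analogue of the example: since the real numbers cannot be enumerated, it assumes the integers modulo 5 as a stand-in domain, with addition taken modulo 5.

```python
# Assumption: a finite stand-in for the reals, namely the integers modulo 5.
DOMAIN = range(5)


def P(x, y):
    """The predicate 'x + y = 0', read modulo 5."""
    return (x + y) % 5 == 0


# forall x : exists y : P(x, y)
forall_exists = all(any(P(x, y) for y in DOMAIN) for x in DOMAIN)

# exists y : forall x : P(x, y)
exists_forall = any(all(P(x, y) for x in DOMAIN) for y in DOMAIN)
```

The nesting order of `all` and `any` mirrors the order of the quantifiers, and the two results differ for the same reason as in the example.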


Notice that the proposition $\exists x : P(x)$ means “there exists at least one $x$ such that $P(x)$
is true”. Often in mathematics we prove that “there exists a unique $x$ such that $P(x)$ is
true”. We therefore have the following definition.


Definition. Let $P(x)$ be a predicate. We define the unique existential quantifier $\exists!$ by:

\[ \exists! x : P(x) :\Leftrightarrow (\exists x : P(x)) \land \forall y : \forall z : (P(y) \land P(z) \Rightarrow y = z). \]

This definition clearly separates the existence condition from the uniqueness condition.
An equivalent definition with the advantage of brevity is:

\[ \exists! x : P(x) :\Leftrightarrow (\exists x : \forall y : P(y) \Leftrightarrow x = y). \]


1.3 Axiomatic systems and theory of proofs


Definition. An axiomatic system is a finite sequence of propositions $a_1, a_2, \ldots, a_N$, which
are called the axioms of the system.


Definition. A proof of a proposition $p$ within an axiomatic system $a_1, a_2, \ldots, a_N$ is a finite
sequence of propositions $q_1, q_2, \ldots, q_M$ such that $q_M = p$ and for any $1 \leq j \leq M$ one of the
following is satisfied:

(A) $q_j$ is a proposition from the list of axioms;

(T) $q_j$ is a tautology;

(M) $\exists\, 1 \leq m, n < j : (q_m \land q_n \Rightarrow q_j)$ is true.

Remark 1.9. If $p$ can be proven within an axiomatic system $a_1, a_2, \ldots, a_N$, we write:
\[ a_1, a_2, \ldots, a_N \vdash p \]
and we read “$a_1, a_2, \ldots, a_N$ proves $p$”.


Remark 1.10. This definition of proof makes it easy to recognise a proof. A computer could
easily check whether or not the conditions (A), (T) and (M) are satisfied by a sequence
of propositions. To actually find a proof of a proposition is a whole different story.


Remark 1.11. Obviously, any tautology that appears in the list of axioms of an axiomatic 
system can be removed from the list without impairing the power of the axiomatic system. 


An extreme case of an axiomatic system is propositional logic. The axiomatic system for 
propositional logic is the empty sequence. This means that all we can prove in propositional 
logic are tautologies. 


Definition. An axiomatic system $a_1, a_2, \ldots, a_N$ is said to be consistent if there exists a
proposition $q$ which cannot be proven from the axioms. In symbols:

\[ \exists q : \neg(a_1, a_2, \ldots, a_N \vdash q). \]


The idea behind this definition is the following. Consider an axiomatic system which
contains contradicting propositions:

\[ a_1, \ldots, s, \ldots, \neg s, \ldots, a_N. \]

Then, given any proposition $q$, the following is a proof of $q$ within this system:

\[ s, \ \neg s, \ q. \]

Indeed, $s$ and $\neg s$ are legitimate steps in the proof since they are axioms. Moreover, $s \land \neg s$
is a contradiction and thus $(s \land \neg s) \Rightarrow q$ is a tautology. Therefore, $q$ follows from condition
(M). This shows that any proposition can be proven within a system with contradictory
axioms. In other words, the inability to prove every proposition is a property possessed by
no contradictory system, and hence we define a consistent system as one with this property.
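That $(s \land \neg s) \Rightarrow q$ is a tautology can be confirmed by evaluating it on every assignment of truth values. The following Python sketch does this with a small helper; `is_tautology` and `ex_falso` are hypothetical names introduced here, and formulas are represented as plain Python functions of Boolean arguments.

```python
from itertools import product


def is_tautology(formula, n_vars):
    """True if `formula` evaluates to True for every assignment of its variables."""
    return all(
        formula(*values) for values in product([False, True], repeat=n_vars)
    )


def ex_falso(s, q):
    """(s and not s) => q, with p => q encoded as (not p) or q."""
    return (not (s and not s)) or q


def contradiction(s):
    """s and not s, which is false for every value of s."""
    return s and not s
```

The same helper shows that a contradiction is not a tautology, which is the fact used in the consistency proof of propositional logic.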

Having come this far, we can now state (and prove) an impressive-sounding theorem.


Theorem 1.12. Propositional logic is consistent. 


Proof. It suffices to show that there exists a proposition that cannot be proven within
propositional logic. Propositional logic has the empty sequence as axioms. Therefore, only
conditions (T) and (M) are relevant here. The latter allows the insertion of a proposition
$q_j$ such that $(q_m \land q_n) \Rightarrow q_j$ is true, where $q_m$ and $q_n$ are propositions that precede $q_j$ in
the proof sequence. However, since (T) only allows the insertion of a tautology anywhere
in the proof sequence, the propositions $q_m$ and $q_n$ must be tautologies. Consequently, for
$(q_m \land q_n) \Rightarrow q_j$ to be true, $q_j$ must also be a tautology. Hence, the proof sequence consists
entirely of tautologies and thus only tautologies can be proven.

Now let $q$ be any proposition. Then $q \land \neg q$ is a contradiction, hence not a tautology
and thus cannot be proven. Therefore, propositional logic is consistent.


Remark 1.13. While it is perfectly clear how to define consistency, it is very difficult
to prove consistency for a given axiomatic system, propositional logic being a big
exception.


Theorem 1.14. Any axiomatic system powerful enough to encode elementary arithmetic 
is either inconsistent or contains an undecidable proposition, i.e. a proposition that can be 
neither proven nor disproven within the system. 


An example of an undecidable proposition is the Continuum hypothesis within the 
Zermelo-Fraenkel axiomatic system. 


2 Axioms of set theory 


2.1 The $\in$-relation


Set theory is built on the postulate that there is a fundamental relation (i.e. a predicate of
two variables) denoted $\in$ and read as “epsilon”. There will be no definition of what $\in$ is, or
of what a set is. Instead, we will have nine axioms concerning $\in$ and sets, and it is only in
terms of these nine axioms that $\in$ and sets are defined at all. Here is an overview of the
axioms. We will have:


• 2 basic existence axioms, one about the $\in$-relation and the other about the existence
  of the empty set;

• 4 construction axioms, which establish rules for building new sets from given ones.
  They are the pair set axiom, the union set axiom, the replacement axiom and the
  power set axiom;

• 2 further existence/construction axioms; these are slightly more advanced and newer
  compared to the others;

• 1 axiom of foundation, excluding some constructions as not being sets.

Using the $\in$-relation we can immediately define the following relations:

• $x \notin y :\Leftrightarrow \neg(x \in y)$

• $x \subseteq y :\Leftrightarrow \forall a : (a \in x \Rightarrow a \in y)$

• $x = y :\Leftrightarrow (x \subseteq y) \land (y \subseteq x)$

• $x \subset y :\Leftrightarrow (x \subseteq y) \land \neg(x = y)$


Remark 2.1. A comment about notation. Since $\in$ is a predicate of two variables, for
consistency of notation we should write $\in(x, y)$. However, the notation $x \in y$ is much more
common (as well as intuitive) and hence we simply define:
\[ x \in y :\Leftrightarrow\ \in(x, y) \]
and we read “$x$ is in (or belongs to) $y$” or “$x$ is an element (or a member) of $y$”. Similar
remarks apply to the other relations $\notin$, $\subseteq$ and $=$.


2.2  Zermelo-Fraenkel axioms of set theory 


Axiom on the $\in$-relation. The expression $x \in y$ is a proposition if, and only if, both $x$
and $y$ are sets. In symbols:
\[ \forall x : \forall y : (x \in y) \veebar \neg(x \in y). \]

We remarked, previously, that it is not the task of predicate logic to inquire about the nature
of the variables on which predicates depend. This first axiom clarifies that the variables on
which the relation $\in$ depends are sets. In other words, if $x \in y$ is not a proposition (i.e. it
does not have the property of being either true or false) then $x$ and $y$ are not both sets.

This seems so trivial that, for a long time, people thought that this was not much of a
condition. But, in fact, it is. It tells us when something is not a set.


Example 2.2 (Russell’s paradox). Suppose that there is some $u$ which has the following
property:
\[ \forall x : (x \notin x \Leftrightarrow x \in u), \]
i.e. $u$ contains all the sets that are not elements of themselves, and no others. We wish to
determine whether $u$ is a set or not. In order to do so, consider the expression $u \in u$. If $u$
is a set then, by the first axiom, $u \in u$ is a proposition.

However, we will show that this is not the case. Suppose first that $u \in u$ is true. Then
$\neg(u \notin u)$ is true and thus $u$ does not satisfy the condition for being an element of $u$, and
hence is not an element of $u$. Thus:

\[ u \in u \Rightarrow \neg(u \in u) \]

and this is a contradiction. Therefore, $u \in u$ cannot be true. Then, if it is a proposition,
it must be false. However, if $u \notin u$, then $u$ satisfies the condition for being a member of $u$
and thus:

\[ u \notin u \Rightarrow \neg(u \notin u) \]

which is, again, a contradiction. Therefore, $u \in u$ does not have the property of being either
true or false (it can be neither) and hence it is not a proposition. Thus, our first axiom
implies that $u$ is not a set, for if it were, then $u \in u$ would be a proposition.


Remark 2.3. The fact that $u$ as defined above is not a set means that expressions like:
\[ u \in u, \quad x \in u, \quad u \in x, \quad x \notin u, \quad \text{etc.} \]
are not propositions and thus, they are not part of axiomatic set theory.


Axiom on the existence of an empty set. There exists a set that contains no
elements. In symbols:

\[ \exists y : \forall x : x \notin y. \]

Notice the use of “an” above. In fact, we have all the tools to prove that there is only one
empty set. We do not need this to be an axiom.


Theorem 2.4. There is only one empty set, and we denote it by $\emptyset$.


Proof. (Standard textbook style). Suppose that $x$ and $x'$ are both empty sets. Then $y \in x$
is false as $x$ is the empty set. But then:

\[ (y \in x) \Rightarrow (y \in x') \]

is true, and in particular it is true independently of $y$. Therefore:

\[ \forall y : (y \in x) \Rightarrow (y \in x') \]

and hence $x \subseteq x'$. Conversely, by the same argument, we have:

\[ \forall y : (y \in x') \Rightarrow (y \in x) \]

and thus $x' \subseteq x$. Hence $(x \subseteq x') \land (x' \subseteq x)$ and therefore $x = x'$.


This is the proof that is found in standard textbooks. However, we gave a definition 
of what a proof within the axiomatic system of propositional logic is, and this “proof” does 
not satisfy our definition of proof. 

In order to give a precise proof, we first have to encode our assumptions into a sequence 
of axioms. These consist of the axioms of propositional logic (the empty sequence) plus: 


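\[ a_1 \Leftrightarrow \forall y : y \notin x \]
\[ a_2 \Leftrightarrow \forall y : y \notin x' \]

i.e. $x$ and $x'$ are both empty sets. We now have to write down a (finite) sequence of
propositions:

\[ q_1 \Leftrightarrow \ldots \]
\[ q_2 \Leftrightarrow \ldots \]
\[ \vdots \]
\[ q_M \Leftrightarrow x = x' \]

with $M$ to be determined and such that, for each $1 \leq j \leq M$, one of the following is satisfied:

(A) $q_j \Leftrightarrow a_1$ or $q_j \Leftrightarrow a_2$;

(T) $q_j$ is a tautology;

(M) $\exists\, 1 \leq m, n < j : (q_m \land q_n \Rightarrow q_j)$ is true.

These are the three conditions that a sequence of propositions must satisfy in order to be
a proof.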
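
Proof. (Formal) We begin with a tautology.

\[ q_1 \Leftrightarrow y \notin x \Rightarrow \forall y : (y \in x \Rightarrow y \in x') \qquad \text{(T)} \]
\[ q_2 \Leftrightarrow \forall y : y \notin x \qquad \text{(A) using } a_1 \]
\[ q_3 \Leftrightarrow \forall y : (y \in x \Rightarrow y \in x') \qquad \text{(M) using } q_1 \text{ and } q_2 \]

The third step follows since $q_1 \land q_2 \Rightarrow q_3$ is of the form:
\[ ((p \Rightarrow r) \land p) \Rightarrow r, \]
where $p \Leftrightarrow y \notin x$ and $r \Leftrightarrow \forall y : (y \in x \Rightarrow y \in x')$, and it is easily seen to be true by
constructing a truth table. Moreover, by the definition of $\subseteq$, we may rewrite $q_3$ as:

\[ q_3 \Leftrightarrow x \subseteq x'. \]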




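The next three steps are very similar to the first three:

\[ q_4 \Leftrightarrow y \notin x' \Rightarrow \forall y : (y \in x' \Rightarrow y \in x) \qquad \text{(T)} \]
\[ q_5 \Leftrightarrow \forall y : y \notin x' \qquad \text{(A) using } a_2 \]
\[ q_6 \Leftrightarrow \forall y : (y \in x' \Rightarrow y \in x) \qquad \text{(M) using } q_4 \text{ and } q_5 \]

where again, $q_6$ may be written as:
\[ q_6 \Leftrightarrow x' \subseteq x. \]

Finally, we have:
\[ q_7 \Leftrightarrow (x \subseteq x') \land (x' \subseteq x) \qquad \text{(M) using } q_3 \text{ and } q_6. \]

This follows since $q_3 \land q_6 \Rightarrow q_7$ is of the form $p \Rightarrow p$, which is obviously a tautology.
Recalling the definition of $=$, we may rewrite $q_7$ as:

\[ q_7 \Leftrightarrow x = x' \]

thereby concluding the proof in seven steps.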


Axiom on pair sets. Let $x$ and $y$ be sets. Then there exists a set that contains as its
elements precisely $x$ and $y$. In symbols:

\[ \forall x : \forall y : \exists m : \forall u : (u \in m \Leftrightarrow (u = x \lor u = y)). \]

The set $m$ is called the pair set of $x$ and $y$ and it is denoted by $\{x, y\}$.


Remark 2.5. We have chosen $\{x, y\}$ as the notation for the pair set of $x$ and $y$, but what
about $\{y, x\}$? The fact that the definition of the pair set remains unchanged if we swap $x$
and $y$ suggests that $\{x, y\}$ and $\{y, x\}$ are the same set. Indeed, by definition, we have:

\[ (a \in \{x, y\} \Rightarrow a \in \{y, x\}) \land (a \in \{y, x\} \Rightarrow a \in \{x, y\}) \]

independently of $a$, hence $(\{x, y\} \subseteq \{y, x\}) \land (\{y, x\} \subseteq \{x, y\})$ and thus $\{x, y\} = \{y, x\}$.


The pair set $\{x, y\}$ is thus an unordered pair. However, using the axiom on pair sets,
it is also possible to define an ordered pair $(x, y)$ such that $(x, y) \neq (y, x)$ whenever
$x \neq y$. The defining property of an ordered pair is the following:

\[ (x, y) = (a, b) \Leftrightarrow x = a \land y = b. \]

One candidate which satisfies this property is $(x, y) := \{x, \{x, y\}\}$, which is a set by the
axiom on pair sets.
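Finite sets of this kind can be modelled with Python's immutable `frozenset`, which gives a quick way to experiment with the candidate ordered pair. The sketch below is an illustration under that modelling assumption, not a proof: the names `pair_set`, `ordered_pair` and `UNIVERSE` are introduced here, and the defining property is only tested on a handful of small sets built from the empty set.

```python
from itertools import product

EMPTY = frozenset()


def pair_set(x, y):
    """The unordered pair {x, y}; pair_set(x, x) has the single element x."""
    return frozenset({x, y})


def ordered_pair(x, y):
    """The candidate ordered pair (x, y) := {x, {x, y}}."""
    return pair_set(x, pair_set(x, y))


# A small test universe: {}, {{}}, {{}, {{}}} and {{{}}}.
ONE = pair_set(EMPTY, EMPTY)
UNIVERSE = [EMPTY, ONE, pair_set(EMPTY, ONE), pair_set(ONE, ONE)]


def defining_property_holds():
    """Check (x, y) = (a, b) <=> x = a and y = b on the test universe."""
    return all(
        (ordered_pair(x, y) == ordered_pair(a, b)) == (x == a and y == b)
        for x, y, a, b in product(UNIVERSE, repeat=4)
    )
```

Because `frozenset` ignores order and repetition, `pair_set(x, y)` and `pair_set(y, x)` compare equal, while the two ordered pairs do not.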


Remark 2.6. The pair set axiom also guarantees the existence of one-element sets, called
singletons. If $x$ is a set, then we define $\{x\} := \{x, x\}$. Informally, we can say that $\{x\}$ and
$\{x, x\}$ express the same amount of information, namely that they contain $x$.




Axiom on union sets. Let $x$ be a set. Then there exists a set whose elements are
precisely the elements of the elements of $x$. In symbols:

\[ \forall x : \exists u : \forall y : (y \in u \Leftrightarrow \exists s : (y \in s \land s \in x)). \]

The set $u$ is denoted by $\bigcup x$.


Example 2.7. Let $a, b$ be sets. Then $\{a\}$ and $\{b\}$ are sets by the pair set axiom, and hence
$x := \{\{a\}, \{b\}\}$ is a set, again by the pair set axiom. Then the expression:

\[ \bigcup x = \{a, b\} \]

is a set by the union axiom.


Notice that, since $a$ and $b$ are sets, we could have immediately concluded that $\{a, b\}$
is a set by the pair set axiom. The union set axiom is really needed to construct sets with
more than 2 elements.


Example 2.8. Let $a, b, c$ be sets. Then $\{a\}$ and $\{b, c\}$ are sets by the pair set axiom, and
hence $x := \{\{a\}, \{b, c\}\}$ is a set, again by the pair set axiom. Then the expression:

\[ \bigcup x = \{a, b, c\} \]

is a set by the union set axiom. This time the union set axiom was really necessary to
establish that $\{a, b, c\}$ is a set, i.e. in order to be able to use it meaningfully in conjunction
with the $\in$-relation.


The previous example easily generalises to a definition. 


Definition. Let $a_1, a_2, \ldots, a_N$ be sets. We define recursively for all $N \geq 2$:

\[ \{a_1, a_2, \ldots, a_{N+1}\} := \bigcup \{\{a_1, a_2, \ldots, a_N\}, \{a_{N+1}\}\}. \]
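The recursive construction can be traced step by step in code. This is a minimal sketch, again assuming that finite sets are modelled by Python `frozenset` objects; `singleton`, `pair_set`, `union` and `finite_set` are illustrative names, and `finite_set` expects at least one argument.

```python
def singleton(a):
    """The one-element set {a}, i.e. the pair set {a, a}."""
    return frozenset({a})


def pair_set(a, b):
    """The unordered pair {a, b}."""
    return frozenset({a, b})


def union(x):
    """The union set of x: all elements of the elements of x."""
    return frozenset(y for s in x for y in s)


def finite_set(*elements):
    """Build {a_1, ..., a_N} using only pair sets and unions (N >= 1)."""
    if len(elements) <= 2:
        return pair_set(elements[0], elements[-1])
    head = finite_set(*elements[:-1])
    return union(pair_set(head, singleton(elements[-1])))
```

For three sets `a`, `b`, `c`, the call `finite_set(a, b, c)` first forms the pair set of `a` and `b`, then takes the union of that pair set with the singleton of `c`.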


Remark 2.9. The fact that the $x$ that appears in $\bigcup x$ has to be a set is a crucial restriction.
Informally, we can say that it is only possible to take unions of as many sets as would fit
into a set. The “collection” of all the sets that do not contain themselves is not a set or, we
could say, does not fit into a set. Therefore it is not possible to take the union of all the
sets that do not contain themselves. This is very subtle, but also very precise.


Axiom of replacement. Let $R$ be a functional relation and let $m$ be a set. Then the
image of $m$ under $R$, denoted by $\mathrm{im}_R(m)$, is again a set.

Of course, we now need to define the new terms that appear in this axiom. Recall that
a relation is simply a predicate of two variables.


Definition. A relation $R$ is said to be functional if:

\[ \forall x : \exists! y : R(x, y). \]


Definition. Let $m$ be a set and let $R$ be a functional relation. The image of $m$ under $R$
consists of all those $y$ for which there is an $x \in m$ such that $R(x, y)$.

None of the previous axioms imply that the image of a set under a functional relation is
again a set. The assumption that it always is, is made explicit by the axiom of replacement.

It is very likely that the reader has come across a weaker form of the axiom of
replacement, called the principle of restricted comprehension, which says the following.


Proposition 2.10. Let $P(x)$ be a predicate and let $m$ be a set. Then the elements $y \in m$
such that $P(y)$ is true constitute a set, which we denote by:

\[ \{y \in m \mid P(y)\}. \]


Remark 2.11. The principle of restricted comprehension is not to be confused with the
“principle” of universal comprehension which states that $\{y \mid P(y)\}$ is a set for any predicate
and was shown to be inconsistent by Russell. Observe that the $y \in m$ condition makes it
so that $\{y \in m \mid P(y)\}$ cannot have more elements than $m$ itself.


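Remark 2.12. If $y$ is a set, we define the following notation:
\[ \forall x \in y : P(x) :\Leftrightarrow \forall x : (x \in y \Rightarrow P(x)) \]
and:
\[ \exists x \in y : P(x) :\Leftrightarrow \neg(\forall x \in y : \neg P(x)). \]

Pulling the $\neg$ through, we can also write:

\[ \exists x \in y : P(x) \Leftrightarrow \neg(\forall x \in y : \neg P(x)) \]
\[ \Leftrightarrow \neg(\forall x : (x \in y \Rightarrow \neg P(x))) \]
\[ \Leftrightarrow \exists x : \neg(x \in y \Rightarrow \neg P(x)) \]
\[ \Leftrightarrow \exists x : (x \in y \land P(x)), \]

where we have used the equivalence $(p \Rightarrow q) \Leftrightarrow \neg(p \land \neg q)$.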


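The principle of restricted comprehension is a consequence of the axiom of replacement.


Proof. We have two cases.

1. If $\neg(\exists y \in m : P(y))$, then we define: $\{y \in m \mid P(y)\} := \emptyset$.

2. If $\exists \hat{y} \in m : P(\hat{y})$, then let $R$ be the functional relation:

\[ R(x, y) :\Leftrightarrow (P(x) \land x = y) \lor (\neg P(x) \land \hat{y} = y) \]

and hence define $\{y \in m \mid P(y)\} := \mathrm{im}_R(m)$. Under $R$, every $x$ satisfying $P$ is
related to itself and every other $x$ is related to $\hat{y}$, which itself satisfies $P$, so the image
of $m$ consists exactly of the elements of $m$ that satisfy $P$.

Don’t worry if you don’t see this immediately. You need to stare at the definitions for
a while and then it will become clear.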
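The two cases of this proof translate almost literally into code. The sketch below is an illustration only: it uses Python integers as stand-ins for sets, and the names `image`, `comprehension` and `y_hat` are introduced here. The helper `image` also searches a finite pool of candidate values, which is a simplification of the notion of image in the text.

```python
def image(R, m, candidates):
    """All y in `candidates` such that R(x, y) holds for some x in m."""
    return frozenset(y for y in candidates if any(R(x, y) for x in m))


def comprehension(P, m):
    """Build {y in m | P(y)} as the image of m under a functional relation."""
    witnesses = [y for y in m if P(y)]
    if not witnesses:
        # Case 1: no element of m satisfies P.
        return frozenset()

    # Case 2: fix one element y_hat of m with P(y_hat) true.
    y_hat = witnesses[0]

    def R(x, y):
        # Elements satisfying P are related to themselves, all others to y_hat.
        return (P(x) and x == y) or ((not P(x)) and y == y_hat)

    return image(R, m, candidates=m)
```

For instance, `comprehension(lambda n: n % 2 == 0, frozenset(range(6)))` collects the even numbers below 6, because each odd number is sent to the chosen even witness.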


Remark 2.13. We will rarely invoke the axiom of replacement in full. We will only invoke
the weaker principle of restricted comprehension, with which we are all familiar.


We can now define the intersection and the relative complement of sets. 




Definition. Let $x$ be a set. Then we define the intersection of $x$ by:

\[ \bigcap x := \{a \in \bigcup x \mid \forall b \in x : a \in b\}. \]

If $a, b \in x$ and $\bigcap \{a, b\} = \emptyset$, then $a$ and $b$ are said to be disjoint.


Definition. Let $u$ and $m$ be sets such that $u \subseteq m$. Then the complement of $u$ relative to
$m$ is defined as:
\[ m \setminus u := \{x \in m \mid x \notin u\}. \]

These are both sets by the principle of restricted comprehension, which is ultimately due
to the axiom of replacement.


Axiom on the existence of power sets. Let $m$ be a set. Then there exists a set,
denoted by $\mathcal{P}(m)$, whose elements are precisely the subsets of $m$. In symbols:

\[ \forall x : \exists y : \forall a : (a \in y \Leftrightarrow a \subseteq x). \]

Historically, in naive set theory, the principle of universal comprehension was thought
to be needed in order to define the power set of a set. Traditionally, this would have been
(inconsistently) defined as:

\[ \mathcal{P}(m) := \{y \mid y \subseteq m\}. \]

To define power sets in this fashion, we would need to know, a priori, from which “bigger”
set the elements of the power set come. However, this is not possible based only on
the previous axioms and, in fact, there is no other choice but to dedicate an additional
axiom for the existence of power sets.


Example 2.14. Let $m = \{a, b\}$. Then $\mathcal{P}(m) = \{\emptyset, \{a\}, \{b\}, \{a, b\}\}$.
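For finite sets the power set can be listed explicitly. The following Python sketch, with the illustrative name `power_set`, enumerates the subsets of a `frozenset` by size using `itertools.combinations`; the elements are strings here purely for readability.

```python
from itertools import combinations


def power_set(m):
    """Return the set of all subsets of the finite set m."""
    elements = list(m)
    return frozenset(
        frozenset(subset)
        for size in range(len(elements) + 1)
        for subset in combinations(elements, size)
    )


m = frozenset({"a", "b"})
subsets_of_m = power_set(m)
```

A set with $N$ elements has $2^N$ subsets, which the sketch reproduces for small $N$.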

Remark 2.15. If one defines $(a, b) := \{a, \{a, b\}\}$, then the cartesian product $x \times y$ of two
sets $x$ and $y$, which informally is the set of all ordered pairs of elements of $x$ and $y$, satisfies:

\[ x \times y \subseteq \mathcal{P}(\mathcal{P}(\bigcup \{x, y\})). \]

Hence, the existence of $x \times y$ as a set follows from the axioms on unions, pair sets, power
sets and the principle of restricted comprehension.


Axiom of infinity. There exists a set that contains the empty set and, together with
every other element $y$, it also contains the set $\{y\}$ as an element. In symbols:

\[ \exists x : \emptyset \in x \land \forall y : (y \in x \Rightarrow \{y\} \in x). \]

Let us consider one such set $x$. Then $\emptyset \in x$ and hence $\{\emptyset\} \in x$. Thus, we also have
$\{\{\emptyset\}\} \in x$ and so on. Therefore:

\[ x = \{\emptyset, \{\emptyset\}, \{\{\emptyset\}\}, \{\{\{\emptyset\}\}\}, \ldots\}. \]

We can introduce the following notation for the elements of $x$:

\[ 0 := \emptyset, \quad 1 := \{\emptyset\}, \quad 2 := \{\{\emptyset\}\}, \quad 3 := \{\{\{\emptyset\}\}\}, \quad \ldots \]

Corollary 2.16. The “set” $\mathbb{N} := x$ is a set according to axiomatic set theory.


This would not be the case without the axiom of infinity since it is not possible to
prove that $\mathbb{N}$ constitutes a set from the previous axioms.

Remark 2.17. At this point, one might suspect that we would need an extra axiom for the
existence of the real numbers. But, in fact, we can define $\mathbb{R} := \mathcal{P}(\mathbb{N})$, which is a set by the
axiom on power sets.

Remark 2.18. The version of the axiom of infinity that we stated is the one that was first
put forward by Zermelo. A more modern formulation is the following. There exists a set
that contains the empty set and, together with every other element $y$, it also contains the
set $y \cup \{y\}$ as an element. Here we used the notation:

\[ x \cup y := \bigcup \{x, y\}. \]

With this formulation, the natural numbers look like:

\[ \mathbb{N} := \{\emptyset, \{\emptyset\}, \{\emptyset, \{\emptyset\}\}, \{\emptyset, \{\emptyset\}, \{\emptyset, \{\emptyset\}\}\}, \ldots\} \]

This may appear more complicated than what we had before, but it is much nicer for two
reasons. First, the natural number $n$ is represented by an $n$-element set rather than a
one-element set. Second, it generalizes much more naturally to the system of transfinite ordinal
numbers where the successor operation $s(x) = x \cup \{x\}$ applies to transfinite ordinals as well
as natural numbers. Moreover, the natural numbers have the same defining property as the
ordinals: they are transitive sets strictly well-ordered by the $\in$-relation.
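The difference between the two encodings of the natural numbers is easy to see by building the first few numerals. This is a small sketch assuming finite sets are modelled by Python `frozenset` objects; the function names `zermelo` and `von_neumann` are labels chosen here for the first and the more modern formulation respectively.

```python
EMPTY = frozenset()


def zermelo(n):
    """n-fold application of y -> {y} to the empty set."""
    s = EMPTY
    for _ in range(n):
        s = frozenset({s})
    return s


def von_neumann(n):
    """n-fold application of the successor y -> y U {y} to the empty set."""
    s = EMPTY
    for _ in range(n):
        s = s | frozenset({s})
    return s
```

In the second encoding the numeral for $n$ has exactly $n$ elements, namely the numerals of all smaller numbers, whereas in the first encoding every non-zero numeral has a single element.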

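Axiom of choice. Let $x$ be a set whose elements are non-empty and mutually disjoint.
Then there exists a set $y$ which contains exactly one element of each element of $x$. In
symbols:

\[ \forall x : P(x) \Rightarrow \exists y : \forall a \in x : \exists! b \in a : b \in y, \]

where $P(x) :\Leftrightarrow (\exists a : a \in x) \land (\forall a : \forall b : (a \in x \land b \in x) \Rightarrow \bigcap \{a, b\} = \emptyset)$.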


Remark 2.19. The axiom of choice is independent of the other 8 axioms, which means
that one could have set theory with or without the axiom of choice. However, standard
mathematics uses the axiom of choice and hence so will we. There are a number of theorems
that can only be proved by using the axiom of choice. Amongst these we have:

• every vector space has a basis;

• there exists a complete system of representatives of an equivalence relation.


Axiom of foundation. Every non-empty set $x$ contains an element $y$ that has none
of its elements in common with $x$. In symbols:

\[ \forall x : (\exists a : a \in x) \Rightarrow \exists y \in x : \bigcap \{x, y\} = \emptyset. \]

An immediate consequence of this axiom is that there is no set that contains itself as an
element.


Together, these nine axioms are called ZFC set theory, which is shorthand
for Zermelo-Fraenkel set theory with the axiom of Choice.


3 Classification of sets


3.1 Maps between sets 


A recurrent theme in mathematics is the classification of spaces by means of structure- 
preserving maps between them. 

A space is usually meant to be some set equipped with some structure, which is usually
some other set. We will define each instance of space precisely when we need it. In
the case of sets considered themselves as spaces, there is no extra structure beyond the set
and hence, the structure may be taken to be the empty set.


Definition. Let $A, B$ be sets. A map $\phi\colon A \to B$ is a relation such that for each $a \in A$
there exists exactly one $b \in B$ such that $\phi(a, b)$.


The standard notation for a map is:
\[ \phi\colon A \to B \]
\[ a \mapsto \phi(a) \]

which is technically an abuse of notation since $\phi$, being a relation of two variables, should
have two arguments and produce a truth value. However, once we agree that for each $a \in A$
there exists exactly one $b \in B$ such that $\phi(a, b)$ is true, then for each $a$ we can define $\phi(a)$
to be precisely that unique $b$. It is sometimes useful to keep in mind that $\phi$ is actually a
relation.


Example 3.1. Let $M$ be a set. The simplest example of a map is the identity map on $M$:

\[ \mathrm{id}_M\colon M \to M \]
\[ m \mapsto m. \]

The following is standard terminology for a map $\phi\colon A \to B$:

• the set $A$ is called the domain of $\phi$;

• the set $B$ is called the target of $\phi$;

• the set $\phi(A) \equiv \mathrm{im}_\phi(A) := \{\phi(a) \mid a \in A\}$ is called the image of $A$ under $\phi$.


Definition. A map $\phi\colon A \to B$ is said to be:

• injective if $\forall a_1, a_2 \in A : \phi(a_1) = \phi(a_2) \Rightarrow a_1 = a_2$;

• surjective if $\mathrm{im}_\phi(A) = B$;

• bijective if it is both injective and surjective.
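For finite sets these three properties can be tested directly. In the sketch below a map is modelled, as an assumption of this illustration, by a Python `dict` sending each element of the domain to its value; the function names are chosen here and simply follow the definitions.

```python
def is_map(phi, A, B):
    """phi assigns to each a in A exactly one value, and that value lies in B."""
    return set(phi) == set(A) and all(b in B for b in phi.values())


def image(phi):
    """The image of the domain under phi."""
    return set(phi.values())


def is_injective(phi):
    """phi(a1) = phi(a2) implies a1 = a2."""
    return all(a1 == a2 for a1 in phi for a2 in phi if phi[a1] == phi[a2])


def is_surjective(phi, B):
    """The image of the domain is the whole target B."""
    return image(phi) == set(B)


def is_bijective(phi, B):
    return is_injective(phi) and is_surjective(phi, B)
```

Note that surjectivity depends on the chosen target `B`, which is why it is passed explicitly, whereas injectivity depends only on the assignment itself.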


Definition. Two sets $A$ and $B$ are called (set-theoretically) isomorphic if there exists a
bijection $\phi\colon A \to B$. In this case, we write $A \cong_{\mathrm{set}} B$.


Remark 3.2. If there is any bijection $A \to B$ then generally there are many.

Bijections are the “structure-preserving” maps for sets. Intuitively, they pair up the
elements of $A$ and $B$ and a bijection between $A$ and $B$ exists only if $A$ and $B$ have the same
“size”. This is clear for finite sets, but it can also be extended to infinite sets.


Definition (Classification of sets). A set $A$ is:

• infinite if there exists a proper subset $B \subset A$ such that $B \cong_{\mathrm{set}} A$. In particular, if $A$
  is infinite, we further define $A$ to be:

  * countably infinite if $A \cong_{\mathrm{set}} \mathbb{N}$;

  * uncountably infinite otherwise.

• finite if it is not infinite. In this case, we have $A \cong_{\mathrm{set}} \{1, 2, \ldots, N\}$ for some $N \in \mathbb{N}$
  and we say that the cardinality of $A$, denoted by $|A|$, is $N$.


Given two maps $\phi\colon A \to B$ and $\psi\colon B \to C$, we can construct a third map, called the
composition of $\phi$ and $\psi$, denoted by $\psi \circ \phi$ (read “psi after phi”), defined by:

$$\psi \circ \phi\colon A \to C, \qquad a \mapsto \psi(\phi(a)).$$

This is often represented by drawing the following diagram

[Diagram: a triangle with the arrows $\phi\colon A \to B$, $\psi\colon B \to C$ and $\psi \circ \phi\colon A \to C$.]

and by saying that “the diagram commutes” (although sometimes this is assumed even if
it is not explicitly stated). What this means is that every path in the diagram gives the
same result. This might seem notational overkill at this point, but later we will encounter
situations where we will have many maps, going from many places to many other places,
and these diagrams greatly simplify the exposition.

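Proposition 3.3. Composition of maps is associative.

Proof. Indeed, let $\phi\colon A \to B$, $\psi\colon B \to C$ and $\xi\colon C \to D$ be maps. Then we have:

$$\xi \circ (\psi \circ \phi)\colon A \to D, \qquad a \mapsto \xi(\psi(\phi(a)))$$

and:

$$(\xi \circ \psi) \circ \phi\colon A \to D, \qquad a \mapsto \xi(\psi(\phi(a))).$$

Thus $\xi \circ (\psi \circ \phi) = (\xi \circ \psi) \circ \phi$.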

The operation of composition is necessary in order to define inverses of maps.


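Definition. Let $\phi\colon A \to B$ be a bijection. Then the inverse of $\phi$, denoted $\phi^{-1}$, is defined
(uniquely) by:

$$\phi^{-1} \circ \phi = \mathrm{id}_A, \qquad \phi \circ \phi^{-1} = \mathrm{id}_B.$$

Equivalently, we require the following diagram to commute:

[Diagram: $A$ and $B$ joined by the arrows $\phi\colon A \to B$ and $\phi^{-1}\colon B \to A$, with the loops $\mathrm{id}_A$ at $A$ and $\mathrm{id}_B$ at $B$.]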

The inverse map is only defined for bijections. However, the following notion, which we will 
often meet in topology, is defined for any map. 


Definition. Let $\phi\colon A \to B$ be a map and let $V \subseteq B$. Then we define the set:

$$\mathrm{preim}_\phi(V) := \{a \in A \mid \phi(a) \in V\}$$

called the pre-image of $V$ under $\phi$.


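Proposition 3.4. Let $\phi\colon A \to B$ be a map, let $U, V \subseteq B$ and $C = \{C_j \mid j \in J\} \subseteq \mathcal{P}(B)$.
Then:

i) $\mathrm{preim}_\phi(\emptyset) = \emptyset$ and $\mathrm{preim}_\phi(B) = A$;
ii) $\mathrm{preim}_\phi(U \setminus V) = \mathrm{preim}_\phi(U) \setminus \mathrm{preim}_\phi(V)$;
iii) $\mathrm{preim}_\phi(\bigcup C) = \bigcup_{j \in J} \mathrm{preim}_\phi(C_j)$ and $\mathrm{preim}_\phi(\bigcap C) = \bigcap_{j \in J} \mathrm{preim}_\phi(C_j)$.

Proof. i) By definition, we have:

$$\mathrm{preim}_\phi(B) = \{a \in A : \phi(a) \in B\} = A$$

and:

$$\mathrm{preim}_\phi(\emptyset) = \{a \in A : \phi(a) \in \emptyset\} = \emptyset.$$

ii) We have:

$$\begin{aligned}
a \in \mathrm{preim}_\phi(U \setminus V) &\Leftrightarrow \phi(a) \in U \setminus V \\
&\Leftrightarrow \phi(a) \in U \land \phi(a) \notin V \\
&\Leftrightarrow a \in \mathrm{preim}_\phi(U) \land a \notin \mathrm{preim}_\phi(V) \\
&\Leftrightarrow a \in \mathrm{preim}_\phi(U) \setminus \mathrm{preim}_\phi(V).
\end{aligned}$$

iii) We have:

$$\begin{aligned}
a \in \mathrm{preim}_\phi(\textstyle\bigcup C) &\Leftrightarrow \phi(a) \in \textstyle\bigcup C \\
&\Leftrightarrow \textstyle\bigvee_{j \in J} (\phi(a) \in C_j) \\
&\Leftrightarrow \textstyle\bigvee_{j \in J} (a \in \mathrm{preim}_\phi(C_j)) \\
&\Leftrightarrow a \in \textstyle\bigcup_{j \in J} \mathrm{preim}_\phi(C_j).
\end{aligned}$$

Similarly, we get $\mathrm{preim}_\phi(\bigcap C) = \bigcap_{j \in J} \mathrm{preim}_\phi(C_j)$.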
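
The properties of the pre-image can be checked by brute force on a small finite example. The following is only an illustrative sketch: the sets `A`, `B`, the map `phi` (stored as a dictionary) and the helper names are assumptions made here, not part of the notes. The unions and intersections are checked for pairs of subsets only.

```python
from itertools import combinations

def preimage(phi, V):
    """Return {a in dom(phi) | phi(a) in V} for a map given as a dict."""
    return frozenset(a for a, b in phi.items() if b in V)

def powerset(s):
    """Return all subsets of s as frozensets."""
    items = list(s)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]

# Hypothetical example data: phi is neither injective nor surjective.
A = frozenset({1, 2, 3, 4})
B = frozenset({"x", "y", "z"})
phi = {1: "x", 2: "x", 3: "y", 4: "y"}

def check_properties(phi, A, B):
    """Check i), ii) and the pairwise version of iii) for all U, V in P(B)."""
    ok = preimage(phi, frozenset()) == frozenset() and preimage(phi, B) == A
    for U in powerset(B):
        for V in powerset(B):
            ok = ok and preimage(phi, U - V) == preimage(phi, U) - preimage(phi, V)
            ok = ok and preimage(phi, U | V) == preimage(phi, U) | preimage(phi, V)
            ok = ok and preimage(phi, U & V) == preimage(phi, U) & preimage(phi, V)
    return ok
```

Note that no bijectivity is needed: the pre-image is defined for any map, as stated above.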

3.2 Equivalence relations


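Definition. Let $M$ be a set and let $\sim$ be a relation such that the following conditions are
satisfied:

i) reflexivity: $\forall\, m \in M : m \sim m$;
ii) symmetry: $\forall\, m, n \in M : m \sim n \Leftrightarrow n \sim m$;
iii) transitivity: $\forall\, m, n, p \in M : (m \sim n \land n \sim p) \Rightarrow m \sim p$.

Then $\sim$ is called an equivalence relation on $M$.
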
Example 3.5. Consider the following wordy examples.

a) $p \sim q :\Leftrightarrow$ $p$ is of the same opinion as $q$. This relation is reflexive, symmetric and
   transitive. Hence, it is an equivalence relation.

b) $p \sim q :\Leftrightarrow$ $p$ is a sibling of $q$. This relation is symmetric and transitive but not reflexive
   and hence, it is not an equivalence relation.

c) $p \sim q :\Leftrightarrow$ $p$ is taller than $q$. This relation is transitive, but neither reflexive nor symmetric
   and hence, it is not an equivalence relation.

d) $p \sim q :\Leftrightarrow$ $p$ is in love with $q$. This relation is generally not reflexive. People don’t like
   themselves very much. It is certainly not normally symmetric, which is the basis of
   much drama in literature. It is also not transitive, except in some French films.


Definition. Let $\sim$ be an equivalence relation on the set $M$. Then, for any $m \in M$, we
define the set:

$$[m] := \{n \in M \mid m \sim n\}$$

called the equivalence class of $m$. Note that the condition $m \sim n$ is equivalent to $n \sim m$
since $\sim$ is symmetric.


The following are two key properties of equivalence classes. 


Proposition 3.6. Let $\sim$ be an equivalence relation on $M$. Then:

i) $a \in [m] \Rightarrow [a] = [m]$;
ii) either $[m] = [n]$ or $[m] \cap [n] = \emptyset$.

Proof. i) Since $a \in [m]$, we have $a \sim m$. Let $x \in [a]$. Then $x \sim a$ and hence $x \sim m$ by
transitivity. Therefore $x \in [m]$ and hence $[a] \subseteq [m]$. Similarly, we have $[m] \subseteq [a]$ and
hence $[a] = [m]$.

ii) Suppose that $[m] \cap [n] \neq \emptyset$. That is:

$$\exists\, z : z \in [m] \land z \in [n].$$

Thus $z \sim m$ and $z \sim n$ and hence, by symmetry and transitivity, $m \sim n$. This implies
that $m \in [n]$ and hence, by i), that $[m] = [n]$.
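
The two properties just proved say that the equivalence classes partition the set. A minimal sketch on a finite example, assuming a hypothetical relation ("same remainder modulo 3") and hypothetical helper names chosen here purely for illustration:

```python
def equivalence_class(M, related, m):
    """Return [m] = {n in M | m ~ n}, where `related(m, n)` decides m ~ n."""
    return frozenset(n for n in M if related(m, n))

def all_classes(M, related):
    """Collect the distinct equivalence classes of the elements of M."""
    return {equivalence_class(M, related, m) for m in M}

# Hypothetical example: integers from -6 to 6, related if they differ by a multiple of 3.
M = frozenset(range(-6, 7))
same_remainder = lambda m, n: (n - m) % 3 == 0

classes = all_classes(M, same_remainder)

def is_partition(M, classes):
    """Check that the classes are pairwise disjoint (or equal) and cover M."""
    disjoint = all(c == d or not (c & d) for c in classes for d in classes)
    covers = frozenset().union(*classes) == M
    return disjoint and covers
```

Here `classes` contains three sets, and every element of `M` lies in exactly one of them.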

Definition. Let $\sim$ be an equivalence relation on $M$. Then we define the quotient set of $M$
by $\sim$ as:

$$M/{\sim} := \{[m] \mid m \in M\}.$$

This is indeed a set since $[m] \in \mathcal{P}(M)$ and hence we can write more precisely:

$$M/{\sim} := \{[m] \in \mathcal{P}(M) \mid m \in M\}.$$

Then clearly $M/{\sim}$ is a set by the power set axiom and the principle of restricted
comprehension.


Remark 3.7. Due to the axiom of choice, there exists a complete system of representatives
for $\sim$, i.e. a set $R$ such that $R \cong_{\text{set}} M/{\sim}$.


Remark 3.8. Care must be taken when defining maps whose domain is a quotient set if one 
uses representatives to define the map. In order for the map to be well-defined one needs 
to show that the map is independent of the choice of representatives. 


Example 3.9. Let $M = \mathbb{Z}$ and define $\sim$ by:

$$m \sim n :\Leftrightarrow n - m \in 2\mathbb{Z}.$$

It is easy to check that $\sim$ is indeed an equivalence relation. Moreover, we have:

$$[0] = [2] = [4] = \cdots = [-2] = [-4] = \cdots$$

and:

$$[1] = [3] = [5] = \cdots = [-1] = [-3] = \cdots$$

Thus we have: $\mathbb{Z}/{\sim} = \{[0], [1]\}$. We wish to define an addition $\oplus$ on $\mathbb{Z}/{\sim}$ by inheriting
the usual addition on $\mathbb{Z}$. As a tentative definition we could have:

$$\oplus\colon \mathbb{Z}/{\sim} \times \mathbb{Z}/{\sim} \to \mathbb{Z}/{\sim}$$

being given by:

$$[a] \oplus [b] := [a + b].$$

However, we need to check that our definition does not depend on the choice of class
representatives, i.e. if $[a] = [a']$ and $[b] = [b']$, then we should have:

$$[a] \oplus [b] = [a'] \oplus [b'].$$

Indeed, $[a] = [a']$ and $[b] = [b']$ means $a - a' \in 2\mathbb{Z}$ and $b - b' \in 2\mathbb{Z}$, i.e. $a - a' = 2m$ and
$b - b' = 2n$ for some $m, n \in \mathbb{Z}$. We thus have:

$$\begin{aligned}
[a' + b'] &= [a - 2m + b - 2n] \\
&= [(a + b) - 2(m + n)] \\
&= [a + b],
\end{aligned}$$

where the last equality follows since:

$$(a + b) - 2(m + n) - (a + b) = -2(m + n) \in 2\mathbb{Z}.$$

Therefore $[a'] \oplus [b'] = [a] \oplus [b]$ and hence the operation $\oplus$ is well-defined.
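
The independence of representatives can also be tested numerically on a finite window of integers. This is a sketch under stated assumptions: classes are labelled by the remainder modulo 2, and the second operation, `floor_half_sum`, is a hypothetical one invented here only to show what a failure looks like; it is not taken from the notes.

```python
def cls(a):
    """Label the class [a] in Z/~ by the remainder of a modulo 2."""
    return a % 2

def is_well_defined(op, reps):
    """Check that the class of op(a, b) depends only on the classes of a and b."""
    for a in reps:
        for a2 in reps:
            if cls(a) != cls(a2):
                continue
            for b in reps:
                for b2 in reps:
                    if cls(b) == cls(b2) and cls(op(a, b)) != cls(op(a2, b2)):
                        return False
    return True

reps = range(-6, 7)

# The addition inherited from Z, as in the example above.
plus = lambda a, b: a + b

# Hypothetical operation, for contrast only: [a], [b] -> [floor((a + b) / 2)].
floor_half_sum = lambda a, b: (a + b) // 2
```

With these definitions, `is_well_defined(plus, reps)` holds, while `floor_half_sum` fails already for the representatives 0 and 2 of the same class, since `(0 + 0) // 2` is even and `(2 + 0) // 2` is odd.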

Example 3.10. As a counterexample, with the same set-up as in the previous example, let
us define an operation $\star$ by:


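3.3 Construction of $\mathbb{N}$, $\mathbb{Z}$, $\mathbb{Q}$ and $\mathbb{R}$

Recall that, invoking the axiom of infinity, we defined:

$$\mathbb{N} := \{0, 1, 2, 3, \ldots\},$$

where:

$$0 := \emptyset, \quad 1 := \{\emptyset\}, \quad 2 := \{\{\emptyset\}\}, \quad 3 := \{\{\{\emptyset\}\}\}, \quad \ldots$$
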
We would now like to define an addition operation on N by using the axioms of set theory. 
We will need some preliminary definitions. 


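Definition. The successor map $S$ on $\mathbb{N}$ is defined by:

$$S\colon \mathbb{N} \to \mathbb{N}, \qquad n \mapsto \{n\}.$$

Example 3.11. Consider $S(2)$. Since $2 := \{\{\emptyset\}\}$, we have $S(2) = \{\{\{\emptyset\}\}\} =: 3$. Therefore,
we have $S(2) = 3$ as we would have expected.

To make progress, we also need to define the predecessor map, which is only defined
on the set $\mathbb{N}^* := \mathbb{N} \setminus \{\emptyset\}$.

Definition. The predecessor map $P$ on $\mathbb{N}^*$ is defined by:

$$P\colon \mathbb{N}^* \to \mathbb{N}, \qquad n \mapsto m \text{ such that } m \in n.$$

Example 3.12. We have $P(2) = P(\{\{\emptyset\}\}) = \{\emptyset\} = 1$.

Definition. Let $n \in \mathbb{N}$. The $n$-th power of $S$, denoted $S^n$, is defined recursively by:

$$S^n := S \circ S^{P(n)} \text{ if } n \in \mathbb{N}^*, \qquad S^0 := \mathrm{id}_{\mathbb{N}}.$$

We are now ready to define addition.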
We are now ready to define addition. 


Definition. The addition operation on $\mathbb{N}$ is defined as a map:

$$+\colon \mathbb{N} \times \mathbb{N} \to \mathbb{N}, \qquad (m, n) \mapsto m + n := S^n(m).$$

Example 3.13. We have:

$$2 + 1 = S^1(2) = S(2) = 3$$

and:

$$1 + 2 = S^2(1) = S(S^1(1)) = S(S(1)) = S(2) = 3.$$

Using this definition, it is possible to show that $+$ is commutative and associative. The
neutral element of $+$ is $0$ since:

$$m + 0 = S^0(m) = \mathrm{id}_{\mathbb{N}}(m) = m$$

and:

$$0 + m = S^m(0) = S^{P(m)}(1) = S^{P(P(m))}(2) = \cdots = S^0(m) = m.$$
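
The recursive definitions above can be mirrored directly with nested frozen sets. The following is a sketch only: it assumes the successor is $n \mapsto \{n\}$, as suggested by Example 3.11, and the helpers `from_int` and `to_int` are conveniences introduced here for readability; they are not part of the construction.

```python
ZERO = frozenset()  # 0 := the empty set

def S(n):
    """Successor map: n -> {n}."""
    return frozenset({n})

def P(n):
    """Predecessor map: the unique m with m in n (undefined on 0)."""
    if n == ZERO:
        raise ValueError("P is only defined on N*")
    (m,) = n  # a non-zero natural number has exactly one element
    return m

def S_power(n, m):
    """Apply S^n to m, using S^n := S o S^{P(n)} and S^0 := id."""
    if n == ZERO:
        return m
    return S(S_power(P(n), m))

def add(m, n):
    """m + n := S^n(m)."""
    return S_power(n, m)

def from_int(k):
    """Hypothetical helper: build the set representing the Python int k."""
    n = ZERO
    for _ in range(k):
        n = S(n)
    return n

def to_int(n):
    """Hypothetical helper: count how many predecessors n has."""
    k = 0
    while n != ZERO:
        n, k = P(n), k + 1
    return k
```

For instance, `add(from_int(2), from_int(1))` and `add(from_int(1), from_int(2))` both evaluate to the set representing 3.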


Clearly, there exist no inverses for $+$ in $\mathbb{N}$, i.e. given $m \in \mathbb{N}$ (non-zero), there exists no $n \in \mathbb{N}$
such that $m + n = 0$. This motivates the extension of the natural numbers to the integers.
In order to rigorously define $\mathbb{Z}$, we need to define the following relation on $\mathbb{N} \times \mathbb{N}$.

Definition. Let $\sim$ be the relation on $\mathbb{N} \times \mathbb{N}$ defined by:

$$(m, n) \sim (p, q) :\Leftrightarrow m + q = p + n.$$

It is easy to check that this is an equivalence relation as:

i) $(m, n) \sim (m, n)$ since $m + n = m + n$;
ii) $(m, n) \sim (p, q) \Rightarrow (p, q) \sim (m, n)$ since $m + q = p + n \Leftrightarrow p + n = m + q$;
iii) $((m, n) \sim (p, q) \land (p, q) \sim (r, s)) \Rightarrow (m, n) \sim (r, s)$ since we have:

$$m + q = p + n \;\land\; p + s = r + q,$$

hence $m + q + p + s = p + n + r + q$, and thus $m + s = r + n$.

Definition. We define the set of integers by:

$$\mathbb{Z} := (\mathbb{N} \times \mathbb{N})/{\sim}.$$


The intuition behind this definition is that the pair (m,n) stands for “m—n”. In other 
words, we represent each integer by a pair of natural numbers whose (yet to be defined) 
difference is precisely that integer. There are, of course, many ways to represent the same 
integer with a pair of natural numbers in this way. For instance, the integer —1 could be 
represented by (1,2), (2,3), (112,113), ... 

Notice however that (1,2) ~ (2,3), (1,2) ~ (112,113), etc. and indeed, taking the 
quotient by ~ takes care of this “redundancy”. Notice also that this definition relies entirely 
on previously defined entities. 




Remark 3.14. In a first introduction to set theory it is not unlikely to find the claim that the
natural numbers are part of the integers, i.e. $\mathbb{N} \subseteq \mathbb{Z}$. However, according to our definition,
this is obviously nonsense since $\mathbb{N}$ and $\mathbb{Z} := (\mathbb{N} \times \mathbb{N})/{\sim}$ contain entirely different elements.
What is true is that $\mathbb{N}$ can be embedded into $\mathbb{Z}$, i.e. there exists an inclusion map $\iota$, given
by:

$$\iota\colon \mathbb{N} \hookrightarrow \mathbb{Z}, \qquad n \mapsto [(n, 0)]$$

and it is in this sense that $\mathbb{N}$ is included in $\mathbb{Z}$.

Definition. Let $n := [(n, 0)] \in \mathbb{Z}$. Then we define the inverse of $n$ to be $-n := [(0, n)]$.

We would now like to inherit the $+$ operation from $\mathbb{N}$.

Definition. We define the addition of integers $+_{\mathbb{Z}}\colon \mathbb{Z} \times \mathbb{Z} \to \mathbb{Z}$ by:

$$[(m, n)] +_{\mathbb{Z}} [(p, q)] := [(m + p, n + q)].$$

Since we used representatives to define $+_{\mathbb{Z}}$, we would need to check that $+_{\mathbb{Z}}$ is
well-defined. It is an easy exercise.

Example 3.15. $2 +_{\mathbb{Z}} (-3) := [(2, 0)] +_{\mathbb{Z}} [(0, 3)] = [(2, 3)] = [(0, 1)] =: -1$. Hallelujah!
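
The pair construction of the integers can be explored with ordinary tuples. This sketch makes two simplifying assumptions that are not in the notes: Python's built-in non-negative integers stand in for the set-theoretic $\mathbb{N}$, and an integer is handled through a single representative pair rather than through its whole equivalence class. All function names are hypothetical.

```python
def equivalent(x, y):
    """(m, n) ~ (p, q)  iff  m + q == p + n."""
    (m, n), (p, q) = x, y
    return m + q == p + n

def add_Z(x, y):
    """Addition on representatives: (m, n) + (p, q) := (m + p, n + q)."""
    (m, n), (p, q) = x, y
    return (m + p, n + q)

def embed(n):
    """Representative of the image of n under the inclusion N -> Z."""
    return (n, 0)

def negative(n):
    """Representative of -n := [(0, n)]."""
    return (0, n)

def add_is_well_defined(bound):
    """Check on all pairs with entries below `bound` that equivalent inputs give equivalent sums."""
    pairs = [(m, n) for m in range(bound) for n in range(bound)]
    return all(
        equivalent(add_Z(x, y), add_Z(x2, y2))
        for x in pairs for x2 in pairs if equivalent(x, x2)
        for y in pairs for y2 in pairs if equivalent(y, y2)
    )
```

A finite check such as `add_is_well_defined(4)` is of course not a proof, but it illustrates what the "easy exercise" asks for.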


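In a similar fashion, we define the set of rational numbers by:

$$\mathbb{Q} := (\mathbb{Z} \times \mathbb{Z}^*)/{\sim},$$

where $\mathbb{Z}^* := \mathbb{Z} \setminus \{0\}$ and $\sim$ is a relation on $\mathbb{Z} \times \mathbb{Z}^*$ given by:

$$(p, q) \sim (r, s) :\Leftrightarrow ps = qr,$$

assuming that a multiplication operation on the integers has already been defined.

Example 3.16. We have $(2, 3) \sim (4, 6)$ since $2 \times 6 = 12 = 3 \times 4$.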


Similarly to what we did for the integers, here we are representing each rational number
by the collection of pairs of integers (the second one in each pair being non-zero) such that
their (yet to be defined) ratio is precisely that rational number. Thus, for example, we
have:

$$\tfrac{2}{3} := [(2, 3)] = [(4, 6)] = \cdots$$

We also have the canonical embedding of $\mathbb{Z}$ into $\mathbb{Q}$:

$$\iota\colon \mathbb{Z} \hookrightarrow \mathbb{Q}, \qquad p \mapsto [(p, 1)].$$


Definition. We define the addition of rational numbers $+_{\mathbb{Q}}\colon \mathbb{Q} \times \mathbb{Q} \to \mathbb{Q}$ by:

$$[(p, q)] +_{\mathbb{Q}} [(r, s)] := [(ps + rq, qs)]$$

and multiplication of rational numbers by:

$$[(p, q)] \cdot_{\mathbb{Q}} [(r, s)] := [(pr, qs)],$$

where the operations of addition and multiplication that appear on the right hand sides are
the ones defined on $\mathbb{Z}$. It is again necessary (but easy) to check that these operations are
both well-defined.
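
As with the integers, the definitions on representatives can be tried out with tuples. The sketch below assumes Python's built-in integers in place of the constructed $\mathbb{Z}$ and works with single representative pairs `(p, q)` with `q != 0`; the function names are hypothetical.

```python
def equivalent_Q(x, y):
    """(p, q) ~ (r, s)  iff  p * s == q * r."""
    (p, q), (r, s) = x, y
    return p * s == q * r

def add_Q(x, y):
    """(p, q) + (r, s) := (p*s + r*q, q*s)."""
    (p, q), (r, s) = x, y
    return (p * s + r * q, q * s)

def mul_Q(x, y):
    """(p, q) * (r, s) := (p*r, q*s)."""
    (p, q), (r, s) = x, y
    return (p * r, q * s)

# Two representatives of one half and two of one third.
half, half_again = (1, 2), (2, 4)
third, third_again = (1, 3), (-3, -9)
```

Replacing `half` by `half_again` or `third` by `third_again` changes the resulting pair, but not its equivalence class, which is exactly the content of well-definedness.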


There are many ways to construct the reals from the rationals. One is to define a set
$\mathcal{A}$ of almost homomorphisms on $\mathbb{Z}$ and hence define:

$$\mathbb{R} := \mathcal{A}/{\sim},$$

where $\sim$ is a “suitable” equivalence relation on $\mathcal{A}$.

4 Topological spaces: construction and purpose


4.1 Topological spaces 


We will now discuss topological spaces based on our previous development of set theory.
As we will see, a topology on a set provides the weakest structure needed in order to define the
two very important notions of convergence of sequences to points in a set, and of continuity
of maps between two sets. The definition of topology on a set is, at first sight, rather
abstract. But on the upside it is also extremely simple. This definition is the result of a
historical development: it is the simplest definition of topology that mathematicians have found
to be useful.


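Definition. Let $M$ be a set. A topology on $M$ is a set $\mathcal{O} \subseteq \mathcal{P}(M)$ such that:

i) $\emptyset \in \mathcal{O}$ and $M \in \mathcal{O}$;
ii) $\{U, V\} \subseteq \mathcal{O} \Rightarrow \bigcap \{U, V\} \in \mathcal{O}$;
iii) $C \subseteq \mathcal{O} \Rightarrow \bigcup C \in \mathcal{O}$.

The pair $(M, \mathcal{O})$ is called a topological space. If we write “let $M$ be a topological space”
then some topology $\mathcal{O}$ on $M$ is assumed.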


Remark 4.1. Unless $|M| = 1$, there are (usually many) different topologies $\mathcal{O}$ that one can
choose on the set $M$.

    |M|    Number of topologies
    1      1
    2      4
    3      29
    4      355
    5      6,942
    6      209,527
    7      9,535,241
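
The first few entries of this table can be reproduced by brute force: enumerate every subset of $\mathcal{P}(M)$ and test the three axioms. This is a sketch with hypothetical helper names. For a finite candidate it is enough to test pairwise unions and intersections, which is what `is_topology` does. The search space has $2^{2^{|M|}}$ candidates, so this is only practical for very small sets.

```python
from itertools import combinations

def powerset(items):
    """Return all subsets of `items` as frozensets."""
    items = list(items)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]

def is_topology(M, O):
    """Check the axioms for a finite family O of subsets of the finite set M."""
    M = frozenset(M)
    if frozenset() not in O or M not in O:
        return False
    # Closure under pairwise intersections and unions (sufficient for finite O).
    return all((U & V) in O and (U | V) in O for U in O for V in O)

def count_topologies(n):
    """Count the topologies on an n-element set by exhaustive search."""
    M = frozenset(range(n))
    candidates = powerset(powerset(M))  # every subset of P(M)
    return sum(1 for O in candidates if is_topology(M, O))
```

Calling `count_topologies(n)` for `n = 1, 2, 3` reproduces the first three rows of the table.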


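Example 4.2. Let $M = \{a, b, c\}$. Then $\mathcal{O} = \{\emptyset, \{a\}, \{b\}, \{a, b\}, \{a, b, c\}\}$ is a topology on
$M$ since:

i) $\emptyset \in \mathcal{O}$ and $M \in \mathcal{O}$;

ii) Clearly, for any $S \in \mathcal{O}$, $\bigcap \{\emptyset, S\} = \emptyset \in \mathcal{O}$ and $\bigcap \{S, M\} = S \in \mathcal{O}$. Moreover,
    $\{a\} \cap \{b\} = \emptyset \in \mathcal{O}$, $\{a\} \cap \{a, b\} = \{a\} \in \mathcal{O}$, and $\{b\} \cap \{a, b\} = \{b\} \in \mathcal{O}$;

iii) Let $C \subseteq \mathcal{O}$. If $M \in C$, then $\bigcup C = M \in \mathcal{O}$. If $\{a, b\} \in C$ (or $\{a\}, \{b\} \in C$) but
     $M \notin C$, then $\bigcup C = \{a, b\} \in \mathcal{O}$. If either $\{a\} \in C$ or $\{b\} \in C$, but $\{a, b\} \notin C$ and
     $M \notin C$, then $\bigcup C = \{a\} \in \mathcal{O}$ or $\bigcup C = \{b\} \in \mathcal{O}$, respectively. Finally, if none of the
     above hold, then $\bigcup C = \emptyset \in \mathcal{O}$.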
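
Example 4.3. Let $M$ be a set. Then $\mathcal{O} = \{\emptyset, M\}$ is a topology on $M$. Indeed, we have:

i) $\emptyset \in \mathcal{O}$ and $M \in \mathcal{O}$;
ii) $\bigcap \{\emptyset, \emptyset\} = \emptyset \in \mathcal{O}$, $\bigcap \{\emptyset, M\} = \emptyset \in \mathcal{O}$, and $\bigcap \{M, M\} = M \in \mathcal{O}$;
iii) If $M \in C$, then $\bigcup C = M \in \mathcal{O}$, otherwise $\bigcup C = \emptyset \in \mathcal{O}$.

This is called the chaotic topology and can be defined on any set.

Example 4.4. Let $M$ be a set. Then $\mathcal{O} = \mathcal{P}(M)$ is a topology on $M$. Indeed, we have:

i) $\emptyset \in \mathcal{P}(M)$ and $M \in \mathcal{P}(M)$;
ii) If $U, V \in \mathcal{P}(M)$, then $\bigcap \{U, V\} \subseteq M$ and hence $\bigcap \{U, V\} \in \mathcal{P}(M)$;
iii) If $C \subseteq \mathcal{P}(M)$, then $\bigcup C \subseteq M$, and hence $\bigcup C \in \mathcal{P}(M)$.

This is called the discrete topology and can be defined on any set.

We now give some common terminology regarding topologies.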


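Definition. Let $\mathcal{O}_1$ and $\mathcal{O}_2$ be two topologies on a set $M$. If $\mathcal{O}_1 \subseteq \mathcal{O}_2$, then we say that
$\mathcal{O}_1$ is a coarser (or weaker) topology than $\mathcal{O}_2$. Equivalently, we say that $\mathcal{O}_2$ is a finer (or
stronger) topology than $\mathcal{O}_1$.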


Clearly, the chaotic topology is the coarsest topology on any given set, while the discrete 
topology is the finest. 


Definition. Let $(M, \mathcal{O})$ be a topological space. A subset $S$ of $M$ is said to be open (with
respect to $\mathcal{O}$) if $S \in \mathcal{O}$ and closed (with respect to $\mathcal{O}$) if $M \setminus S \in \mathcal{O}$.


Notice that the notions of open and closed sets, as defined, are not mutually exclusive. 
A set could be both or neither, or one and not the other. 


Example 4.5. Let $(M, \mathcal{O})$ be a topological space. Then $\emptyset$ is open since $\emptyset \in \mathcal{O}$. However,
$\emptyset$ is also closed since $M \setminus \emptyset = M \in \mathcal{O}$. Similarly for $M$.

Example 4.6. Let $M = \{a, b, c\}$ and let $\mathcal{O} = \{\emptyset, \{a\}, \{a, b\}, \{a, b, c\}\}$. Then $\{a\}$ is open
but not closed, $\{b, c\}$ is closed but not open, and $\{b\}$ is neither open nor closed.
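
The claims of the last example are easy to verify mechanically. A minimal sketch, with the set and topology of the example written as frozen sets and two hypothetical helper functions:

```python
def is_open(S, O):
    """S is open iff S is an element of the topology O."""
    return frozenset(S) in O

def is_closed(S, M, O):
    """S is closed iff its complement M \\ S is an element of O."""
    return (frozenset(M) - frozenset(S)) in O

M = frozenset("abc")
O = {frozenset(), frozenset("a"), frozenset("ab"), M}

# (open, closed) status of a few subsets of M
status = {name: (is_open(name, O), is_closed(name, M, O))
          for name in ["a", "bc", "b", "abc", ""]}
```

The dictionary `status` records, for each subset, whether it is open and whether it is closed, which shows that the two notions are independent of each other.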


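We will now define what is called the standard topology on $\mathbb{R}^d$, where:

$$\mathbb{R}^d := \underbrace{\mathbb{R} \times \mathbb{R} \times \cdots \times \mathbb{R}}_{d \text{ times}}.$$

We will need the following auxiliary definition.

Definition. For any $x \in \mathbb{R}^d$ and any $r \in \mathbb{R}^+ := \{s \in \mathbb{R} \mid s > 0\}$, we define the open ball of
radius $r$ around the point $x$:

$$B_r(x) := \Big\{ y \in \mathbb{R}^d \;\Big|\; \sqrt{\textstyle\sum_{i=1}^d (y_i - x_i)^2} < r \Big\},$$

where $x := (x_1, x_2, \ldots, x_d)$ and $y := (y_1, y_2, \ldots, y_d)$, with $x_i, y_i \in \mathbb{R}$.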

Remark 4.7. The quantity $\sqrt{\sum_{i=1}^d (y_i - x_i)^2}$ is usually denoted by $\|y - x\|_2$, where $\|\cdot\|_2$ is
the 2-norm on $\mathbb{R}^d$. However, the definition of a norm on a set requires the set to be equipped
with a vector space structure (which we haven’t defined yet), while our construction does
not. Moreover, our construction can be proven to be independent of the particular norm
used to define it, i.e. any other norm will induce the same topological structure.


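Definition. The standard topology on $\mathbb{R}^d$, denoted $\mathcal{O}_{\text{std}}$, is defined by:

$$U \in \mathcal{O}_{\text{std}} :\Leftrightarrow \forall\, p \in U : \exists\, r \in \mathbb{R}^+ : B_r(p) \subseteq U.$$

Of course, simply calling something a topology does not automatically make it into a
topology. We have to prove that $\mathcal{O}_{\text{std}}$, as we defined it, does constitute a topology.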


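Proposition 4.8. The pair $(\mathbb{R}^d, \mathcal{O}_{\text{std}})$ is a topological space.

Proof. i) First, we need to check whether $\emptyset \in \mathcal{O}_{\text{std}}$, i.e. whether:

$$\forall\, p \in \emptyset : \exists\, r \in \mathbb{R}^+ : B_r(p) \subseteq \emptyset$$

is true. This proposition is of the form $\forall\, p \in \emptyset : Q(p)$, which was defined as being
equivalent to:

$$\forall\, p : p \in \emptyset \Rightarrow Q(p).$$

However, since $p \in \emptyset$ is false, the implication is true independent of $p$. Hence the
initial proposition is true and thus $\emptyset \in \mathcal{O}_{\text{std}}$.

Second, by definition, we have $B_r(x) \subseteq \mathbb{R}^d$ independent of $x$ and $r$, hence:

$$\forall\, p \in \mathbb{R}^d : \exists\, r \in \mathbb{R}^+ : B_r(p) \subseteq \mathbb{R}^d$$

is true and thus $\mathbb{R}^d \in \mathcal{O}_{\text{std}}$.

ii) Let $U, V \in \mathcal{O}_{\text{std}}$ and let $p \in U \cap V$. Then:

$$p \in U \cap V :\Leftrightarrow p \in U \land p \in V$$

and hence, since $U, V \in \mathcal{O}_{\text{std}}$, we have:

$$\exists\, r_1 \in \mathbb{R}^+ : B_{r_1}(p) \subseteq U \;\land\; \exists\, r_2 \in \mathbb{R}^+ : B_{r_2}(p) \subseteq V.$$

Let $r = \min\{r_1, r_2\}$. Then:

$$B_r(p) \subseteq B_{r_1}(p) \subseteq U \;\land\; B_r(p) \subseteq B_{r_2}(p) \subseteq V$$

and hence $B_r(p) \subseteq U \cap V$. Therefore $U \cap V \in \mathcal{O}_{\text{std}}$.

iii) Let $C \subseteq \mathcal{O}_{\text{std}}$ and let $p \in \bigcup C$. Then, $p \in U$ for some $U \in C$ and, since $U \in \mathcal{O}_{\text{std}}$, we
have:

$$\exists\, r \in \mathbb{R}^+ : B_r(p) \subseteq U \subseteq \textstyle\bigcup C.$$

Hence $\bigcup C \in \mathcal{O}_{\text{std}}$.

Therefore, $\mathcal{O}_{\text{std}}$ is indeed a topology on $\mathbb{R}^d$.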
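
4.2 Construction of new topologies from given ones

Proposition 4.9. Let $(M, \mathcal{O})$ be a topological space and let $N \subseteq M$. Then:

$$\mathcal{O}|_N := \{U \cap N \mid U \in \mathcal{O}\} \subseteq \mathcal{P}(N)$$

is a topology on $N$ called the induced (subset) topology.

Proof. i) Since $\emptyset \in \mathcal{O}$ and $\emptyset = \emptyset \cap N$, we have $\emptyset \in \mathcal{O}|_N$. Similarly, we have $M \in \mathcal{O}$
and $N = M \cap N$, and thus $N \in \mathcal{O}|_N$.

ii) Let $U, V \in \mathcal{O}|_N$. Then, by definition:

$$\exists\, S \in \mathcal{O} : U = S \cap N \;\land\; \exists\, T \in \mathcal{O} : V = T \cap N.$$

We thus have:

$$U \cap V = (S \cap N) \cap (T \cap N) = (S \cap T) \cap N.$$

Since $S, T \in \mathcal{O}$ and $\mathcal{O}$ is a topology, we have $S \cap T \in \mathcal{O}$ and hence $U \cap V \in \mathcal{O}|_N$.

iii) Let $C := \{S_\alpha \mid \alpha \in A\} \subseteq \mathcal{O}|_N$. By definition, we have:

$$\forall\, \alpha \in A : \exists\, U_\alpha \in \mathcal{O} : S_\alpha = U_\alpha \cap N.$$

Then, using the notation:

$$\bigcup_{\alpha \in A} S_\alpha := \bigcup C = \bigcup \{S_\alpha \mid \alpha \in A\}$$

and the distributivity of intersection over union, we have:

$$\bigcup_{\alpha \in A} S_\alpha = \bigcup_{\alpha \in A} (U_\alpha \cap N) = \Big( \bigcup_{\alpha \in A} U_\alpha \Big) \cap N.$$

Since $\mathcal{O}$ is a topology, we have $\bigcup_{\alpha \in A} U_\alpha \in \mathcal{O}$ and hence $\bigcup C \in \mathcal{O}|_N$.

Thus $\mathcal{O}|_N$ is a topology on $N$.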


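Example 4.10. Consider $(\mathbb{R}, \mathcal{O}_{\text{std}})$ and let:

$$N = [-1, 1] := \{x \in \mathbb{R} \mid -1 \leq x \leq 1\}.$$

Then $(N, \mathcal{O}_{\text{std}}|_N)$ is a topological space. The set $(0, 1]$ is clearly not open in $(\mathbb{R}, \mathcal{O}_{\text{std}})$ since
$(0, 1] \notin \mathcal{O}_{\text{std}}$. However, we have:

$$(0, 1] = (0, 2) \cap [-1, 1],$$

where $(0, 2) \in \mathcal{O}_{\text{std}}$ and hence $(0, 1] \in \mathcal{O}_{\text{std}}|_N$, i.e. the set $(0, 1]$ is open in $(N, \mathcal{O}_{\text{std}}|_N)$.

Definition. Let $(M, \mathcal{O})$ be a topological space and let $\sim$ be an equivalence relation on $M$.
Then, the quotient set:

$$M/{\sim} = \{[m] \in \mathcal{P}(M) \mid m \in M\}$$

can be equipped with the quotient topology $\mathcal{O}_{M/\sim}$ defined by:

$$\mathcal{O}_{M/\sim} := \Big\{ U \subseteq M/{\sim} \;\Big|\; \bigcup U = \bigcup_{[a] \in U} [a] \in \mathcal{O} \Big\}.$$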

An equivalent definition of the quotient topology is as follows. Let $q\colon M \to M/{\sim}$ be
the map:

$$q\colon M \to M/{\sim}, \qquad m \mapsto [m].$$

Then we have:

$$\mathcal{O}_{M/\sim} := \{U \subseteq M/{\sim} \mid \mathrm{preim}_q(U) \in \mathcal{O}\}.$$


Example 4.11. The circle (or 1-sphere) is defined as the set $S^1 := \{(x, y) \in \mathbb{R}^2 \mid x^2 + y^2 = 1\}$
equipped with the subset topology inherited from $\mathbb{R}^2$. The open sets of the circle are (unions
of) open arcs, i.e. arcs without the endpoints. Individual points on the circle are clearly
not open since there is no open set of $\mathbb{R}^2$ whose intersection with the circle is a single point.
However, an individual point on the circle is a closed set since its complement is an open
arc.

An alternative definition of the circle is the following. Let $\sim$ be the equivalence relation
on $\mathbb{R}$ defined by:

$$x \sim y :\Leftrightarrow \exists\, n \in \mathbb{Z} : x = y + 2\pi n.$$

Then the circle can be defined as the set $S^1 := \mathbb{R}/{\sim}$ equipped with the quotient topology.


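Definition. Let $(A, \mathcal{O}_A)$ and $(B, \mathcal{O}_B)$ be topological spaces. Then the set $\mathcal{O}_{A \times B}$ defined
implicitly by:

$$U \in \mathcal{O}_{A \times B} :\Leftrightarrow \forall\, p \in U : \exists\, (S, T) \in \mathcal{O}_A \times \mathcal{O}_B : p \in S \times T \subseteq U$$

is a topology on $A \times B$ called the product topology.

Remark 4.12. This definition can easily be extended to $n$-fold cartesian products:

$$U \in \mathcal{O}_{A_1 \times \cdots \times A_n} :\Leftrightarrow \forall\, p \in U : \exists\, (S_1, \ldots, S_n) \in \mathcal{O}_{A_1} \times \cdots \times \mathcal{O}_{A_n} : p \in S_1 \times \cdots \times S_n \subseteq U.$$

Remark 4.13. Using the previous definition, one can check that the standard topology on
$\mathbb{R}^d$ satisfies:

$$\mathcal{O}_{\text{std}} = \mathcal{O}_{\mathbb{R} \times \mathbb{R} \times \cdots \times \mathbb{R}} \quad (d \text{ times}).$$

Therefore, a more minimalistic definition of the standard topology on $\mathbb{R}^d$ would consist in
defining $\mathcal{O}_{\text{std}}$ only for $\mathbb{R}$ (i.e. $d = 1$) and then extending it to $\mathbb{R}^d$ by the product topology.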


4.3 Convergence

Definition. Let $M$ be a set. A sequence (of points) in $M$ is a function $q\colon \mathbb{N} \to M$.

Definition. Let $(M, \mathcal{O})$ be a topological space. A sequence $q$ in $M$ is said to converge
to a limit point $a \in M$ if:

$$\forall\, U \in \mathcal{O} : a \in U \Rightarrow \exists\, N \in \mathbb{N} : \forall\, n > N : q(n) \in U.$$

Remark 4.14. An open set $U$ of $M$ such that $a \in U$ is called an open neighbourhood of $a$.
If we denote this by $U(a)$, then the previous definition of convergence can be rewritten as:

$$\forall\, U(a) : \exists\, N \in \mathbb{N} : \forall\, n > N : q(n) \in U(a).$$


Example 4.15. Consider the topological space $(M, \{\emptyset, M\})$. Then every sequence in $M$
converges to every point in $M$. Indeed, let $q$ be any sequence and let $a \in M$. Then, $q$
converges to $a$ if:

$$\forall\, U \in \{\emptyset, M\} : a \in U \Rightarrow \exists\, N \in \mathbb{N} : \forall\, n > N : q(n) \in U.$$

This proposition is vacuously true for $U = \emptyset$, while for $U = M$ we have $q(n) \in M$
independent of $n$. Therefore, the (arbitrary) sequence $q$ converges to the (arbitrary) point
$a \in M$.

Example 4.16. Consider the topological space $(M, \mathcal{P}(M))$. Then only definitely constant
sequences converge, where a sequence is definitely constant with value $c \in M$ if:

$$\exists\, N \in \mathbb{N} : \forall\, n > N : q(n) = c.$$

This is immediate from the definition of convergence since in the discrete topology all
singleton sets (i.e. one-element sets) are open.

Example 4.17. Consider the topological space $(\mathbb{R}^d, \mathcal{O}_{\text{std}})$. Then, a sequence $q\colon \mathbb{N} \to \mathbb{R}^d$
converges to $a \in \mathbb{R}^d$ if:

$$\forall\, \varepsilon > 0 : \exists\, N \in \mathbb{N} : \forall\, n > N : \|q(n) - a\|_2 < \varepsilon.$$

Example 4.18. Let $M = \mathbb{R}$ and let $q(n) = 1 - \frac{1}{n+1}$. Then, since $q$ is not definitely constant, it
is not convergent in $(\mathbb{R}, \mathcal{P}(\mathbb{R}))$, but it is convergent in $(\mathbb{R}, \mathcal{O}_{\text{std}})$.


4.4 Continuity 


Definition. Let $(M, \mathcal{O}_M)$ and $(N, \mathcal{O}_N)$ be topological spaces and let $\phi\colon M \to N$ be a map.
Then, $\phi$ is said to be continuous (with respect to the topologies $\mathcal{O}_M$ and $\mathcal{O}_N$) if:

$$\forall\, S \in \mathcal{O}_N : \mathrm{preim}_\phi(S) \in \mathcal{O}_M,$$

where $\mathrm{preim}_\phi(S) := \{m \in M : \phi(m) \in S\}$ is the pre-image of $S$ under the map $\phi$.

Informally, one says that $\phi$ is continuous if the pre-images of open sets are open.


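Example 4.19. If $M$ is equipped with the discrete topology, or $N$ with the chaotic topology,
then any map $\phi\colon M \to N$ is continuous. Indeed, let $S \in \mathcal{O}_N$. If $\mathcal{O}_M = \mathcal{P}(M)$ (and $\mathcal{O}_N$ is
any topology), then we have:

$$\mathrm{preim}_\phi(S) = \{m \in M : \phi(m) \in S\} \subseteq M, \quad \text{hence} \quad \mathrm{preim}_\phi(S) \in \mathcal{P}(M) = \mathcal{O}_M.$$

If instead $\mathcal{O}_N = \{\emptyset, N\}$ (and $\mathcal{O}_M$ is any topology), then either $S = \emptyset$ or $S = N$ and thus,
we have:

$$\mathrm{preim}_\phi(\emptyset) = \emptyset \in \mathcal{O}_M \quad \text{and} \quad \mathrm{preim}_\phi(N) = M \in \mathcal{O}_M.$$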
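
Example 4.20. Let $M = \{a, b, c\}$ and $N = \{1, 2, 3\}$, with respective topologies:

$$\mathcal{O}_M = \{\emptyset, \{b\}, \{a, c\}, \{a, b, c\}\} \quad \text{and} \quad \mathcal{O}_N = \{\emptyset, \{2\}, \{3\}, \{1, 3\}, \{2, 3\}, \{1, 2, 3\}\},$$

and let $\phi\colon M \to N$ be defined by:

$$\phi(a) = 2, \quad \phi(b) = 1, \quad \phi(c) = 2.$$

Then $\phi$ is continuous. Indeed, we have:

$$\begin{aligned}
&\mathrm{preim}_\phi(\emptyset) = \emptyset, \quad \mathrm{preim}_\phi(\{2\}) = \{a, c\}, \quad \mathrm{preim}_\phi(\{3\}) = \emptyset, \\
&\mathrm{preim}_\phi(\{1, 3\}) = \{b\}, \quad \mathrm{preim}_\phi(\{2, 3\}) = \{a, c\}, \quad \mathrm{preim}_\phi(\{1, 2, 3\}) = \{a, b, c\},
\end{aligned}$$

and hence $\mathrm{preim}_\phi(S) \in \mathcal{O}_M$ for all $S \in \mathcal{O}_N$.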
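
For finite spaces, continuity can be decided by simply computing all pre-images. The sketch below uses the two topologies of the example. The map `phi` ($a \mapsto 2$, $b \mapsto 1$, $c \mapsto 2$) is the one consistent with the pre-images listed there, while `psi` is a hypothetical second map added here only to show a failure. Helper names are assumptions of this sketch.

```python
def preimage(phi, S):
    """Return {m | phi(m) in S} for a map given as a dict."""
    return frozenset(m for m, value in phi.items() if value in S)

def is_continuous(phi, O_M, O_N):
    """phi is continuous iff the pre-image of every open set is open."""
    return all(preimage(phi, S) in O_M for S in O_N)

fs = frozenset
O_M = {fs(), fs("b"), fs("ac"), fs("abc")}
O_N = {fs(), fs({2}), fs({3}), fs({1, 3}), fs({2, 3}), fs({1, 2, 3})}

phi = {"a": 2, "b": 1, "c": 2}  # consistent with the pre-images in the example
psi = {"a": 1, "b": 2, "c": 3}  # hypothetical map, for contrast only
```

Here `psi` is a bijection and yet fails to be continuous, because the pre-image of the open set `{3}` is `{c}`, which is not open in `O_M`.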


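Example 4.21. Consider $(\mathbb{R}^d, \mathcal{O}_{\text{std}})$ and $(\mathbb{R}^s, \mathcal{O}_{\text{std}})$. Then $\phi\colon \mathbb{R}^d \to \mathbb{R}^s$ is continuous with
respect to the standard topologies if it satisfies the usual $\varepsilon$-$\delta$ definition of continuity:

$$\forall\, a \in \mathbb{R}^d : \forall\, \varepsilon > 0 : \exists\, \delta > 0 : \forall\, 0 < \|x - a\|_2 < \delta : \|\phi(x) - \phi(a)\|_2 < \varepsilon.$$

Definition. Let $(M, \mathcal{O}_M)$ and $(N, \mathcal{O}_N)$ be topological spaces. A bijection $\phi\colon M \to N$ is
called a homeomorphism if both $\phi\colon M \to N$ and $\phi^{-1}\colon N \to M$ are continuous.

Remark 4.22. Homeo(morphism)s are the structure-preserving maps in topology.
If there exists a homeomorphism $\phi$ between $(M, \mathcal{O}_M)$ and $(N, \mathcal{O}_N)$,
then $\phi$ provides a one-to-one pairing of the open sets of $M$ with the open sets of $N$.

Definition. If there exists a homeomorphism between two topological spaces $(M, \mathcal{O}_M)$ and
$(N, \mathcal{O}_N)$, we say that the two spaces are homeomorphic or topologically isomorphic and we
write $(M, \mathcal{O}_M) \cong_{\text{top}} (N, \mathcal{O}_N)$.

Clearly, if $(M, \mathcal{O}_M) \cong_{\text{top}} (N, \mathcal{O}_N)$, then $M \cong_{\text{set}} N$.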

5 Topological spaces: some heavily used invariants


5.1 Separation properties 


Definition. A topological space $(M, \mathcal{O})$ is said to be T1 if for any two distinct points
$p, q \in M$, $p \neq q$:

$$\exists\, U(p) \in \mathcal{O} : q \notin U(p).$$

Definition. A topological space $(M, \mathcal{O})$ is said to be T2 or Hausdorff if, for any two
distinct points, there exist non-intersecting open neighbourhoods of these two points:

$$\forall\, p, q \in M : p \neq q \Rightarrow \exists\, U(p), V(q) \in \mathcal{O} : U(p) \cap V(q) = \emptyset.$$

Example 5.1. The topological space $(\mathbb{R}^d, \mathcal{O}_{\text{std}})$ is T2 and hence also T1.

Example 5.2. The Zariski topology on an algebraic variety is T1 but not T2.

Example 5.3. The topological space $(M, \{\emptyset, M\})$ does not have the T1 property since for
any $p \in M$, the only open neighbourhood of $p$ is $M$ and for any other $q \neq p$ we have $q \in M$.
Moreover, since this space is not T1, it cannot be T2 either.

Remark 5.4. There are many other “T” properties, including a T2½ property which differs
from T2 in that the neighbourhoods are closed.


5.2 Compactness and paracompactness 


Definition. Let $(M, \mathcal{O})$ be a topological space. A set $C \subseteq \mathcal{P}(M)$ is called a cover (of $M$)
if:

$$\bigcup C = M.$$

Additionally, it is said to be an open cover if $C \subseteq \mathcal{O}$.

Definition. Let $C$ be a cover. Then any subset $\widetilde{C} \subseteq C$ such that $\widetilde{C}$ is still a cover, is called
a subcover. Additionally, it is said to be a finite subcover if it is finite as a set.

Definition. A topological space $(M, \mathcal{O})$ is said to be compact if every open cover has a
finite subcover.

Definition. Let $(M, \mathcal{O})$ be a topological space. A subset $N \subseteq M$ is called compact if the
topological space $(N, \mathcal{O}|_N)$ is compact.

Determining whether a set is compact or not is not an easy task. Fortunately though,
for $\mathbb{R}^d$ equipped with the standard topology $\mathcal{O}_{\text{std}}$, the following theorem greatly simplifies
matters.

Theorem 5.5 (Heine-Borel). Let $\mathbb{R}^d$ be equipped with the standard topology $\mathcal{O}_{\text{std}}$. Then, a
subset of $\mathbb{R}^d$ is compact if, and only if, it is closed and bounded.

A subset $S$ of $\mathbb{R}^d$ is said to be bounded if:

$$\exists\, r \in \mathbb{R}^+ : S \subseteq B_r(0).$$

Remark 5.6. It is also possible to generalize this result to arbitrary metric spaces. A metric
space is a pair $(M, d)$ where $M$ is a set and $d\colon M \times M \to \mathbb{R}$ is a map such that for any
$x, y, z \in M$ the following conditions hold:

i) $d(x, y) \geq 0$;
ii) $d(x, y) = 0 \Leftrightarrow x = y$;
iii) $d(x, y) = d(y, x)$;
iv) $d(x, y) \leq d(x, z) + d(y, z)$.

A metric structure on a set $M$ induces a topology $\mathcal{O}_d$ on $M$ by:

$$U \in \mathcal{O}_d :\Leftrightarrow \forall\, p \in U : \exists\, r \in \mathbb{R}^+ : B_r(p) \subseteq U,$$

where the open ball in a metric space is defined as:

$$B_r(p) := \{x \in M \mid d(p, x) < r\}.$$

In this setting, one can prove that a subset $S \subseteq M$ of a metric space $(M, d)$ is compact if,
and only if, it is complete and totally bounded.

Example 5.7. The interval $[0, 1]$ is compact in $(\mathbb{R}, \mathcal{O}_{\text{std}})$. For instance, the one-element set
containing $(-1, 2)$ is an open cover of $[0, 1]$ which is trivially its own finite subcover. Checking
a single cover does not establish compactness, however, since the definition requires a finite
subcover for every open cover. Instead, $[0, 1]$ is clearly closed and bounded, and hence it is
compact by the Heine-Borel theorem.


Example 5.8. The set $\mathbb{R}$ is not compact in $(\mathbb{R}, \mathcal{O}_{\text{std}})$. To prove this, it suffices to show that
there exists an open cover of $\mathbb{R}$ that does not have a finite subcover. To this end, let:

$$C := \{(n, n + 1) \mid n \in \mathbb{Z}\} \cup \{(n + \tfrac{1}{2}, n + \tfrac{3}{2}) \mid n \in \mathbb{Z}\}.$$

This corresponds to the following picture.

[Figure: the real line $\mathbb{R}$ with ticks at $-1$, $-1/2$, $0$, $1/2$, $1$, covered by overlapping open intervals.]

It is clear that removing even one element from $C$ will cause $C$ to fail to be an open
cover of $\mathbb{R}$. Therefore, there is no finite subcover of $C$ and hence, $\mathbb{R}$ is not compact.

Theorem 5.9. Let $(M, \mathcal{O}_M)$ and $(N, \mathcal{O}_N)$ be compact topological spaces. Then
$(M \times N, \mathcal{O}_{M \times N})$ is a compact topological space.

The above theorem easily extends to finite cartesian products.


Definition. Let $(M, \mathcal{O})$ be a topological space and let $C$ be a cover. A refinement of $C$ is
a cover $R$ such that:

$$\forall\, U \in R : \exists\, V \in C : U \subseteq V.$$

Any subcover of a cover is a refinement of that cover, but the converse is not true in
general. A refinement $R$ is said to be:

• open if $R \subseteq \mathcal{O}$;
• locally finite if for any $p \in M$ there exists a neighbourhood $U(p)$ such that the set:
  $$\{U \in R \mid U \cap U(p) \neq \emptyset\}$$
  is finite as a set.


Compactness is a very strong property. Hence often times it does not hold, but a 
weaker and still useful property, called paracompactness, may still hold. 


Definition. A topological space $(M, \mathcal{O})$ is said to be paracompact if every open cover has
an open refinement that is locally finite.

Corollary 5.10. If a topological space is compact, then it is also paracompact.

Definition. A topological space $(M, \mathcal{O})$ is said to be metrisable if there exists a metric $d$
such that the topology induced by $d$ is precisely $\mathcal{O}$, i.e. $\mathcal{O}_d = \mathcal{O}$.

Theorem 5.11 (Stone). Every metrisable space is paracompact.

Example 5.12. The space $(\mathbb{R}^d, \mathcal{O}_{\text{std}})$ is metrisable since $\mathcal{O}_{\text{std}} = \mathcal{O}_d$ where $d(x, y) := \|x - y\|_2$. Hence
it is paracompact by Stone’s theorem.

Remark 5.13. Paracompactness is, informally, a rather natural property since every
example of a non-paracompact space looks artificial. One such example is the long line (or
Alexandroff line). To construct it, we first observe that we could “build” $\mathbb{R}$ by taking the
interval $[0, 1)$ and stacking countably many copies of it one after the other. Hence, in a
sense, $\mathbb{R}$ is equivalent to $\mathbb{Z} \times [0, 1)$. The long line $L$ is defined analogously as $L := \omega_1 \times [0, 1)$,
where $\omega_1$ is an uncountably infinite set. The resulting space $L$ is not paracompact.

Theorem 5.14. Let $(M, \mathcal{O}_M)$ be a paracompact space and let $(N, \mathcal{O}_N)$ be a compact space.
Then $M \times N$ (equipped with the product topology) is paracompact.

Corollary 5.15. Let $(M, \mathcal{O}_M)$ be a paracompact space and let $(N_i, \mathcal{O}_{N_i})$ be compact spaces
for every $1 \leq i \leq n$. Then $M \times N_1 \times \cdots \times N_n$ is paracompact.


Definition. Let $(M, \mathcal{O}_M)$ be a topological space. A partition of unity of $M$ is a set $\mathcal{F}$
of continuous maps from $M$ to the interval $[0, 1]$ such that for each $p \in M$ the following
conditions hold:

i) there exists $U(p)$ such that the set $\{f \in \mathcal{F} \mid \forall\, x \in U(p) : f(x) \neq 0\}$ is finite;

ii) $\sum_{f \in \mathcal{F}} f(p) = 1$.

If $C$ is an open cover, then $\mathcal{F}$ is said to be subordinate to the cover $C$ if:

$$\forall\, f \in \mathcal{F} : \exists\, U \in C : f(x) \neq 0 \Rightarrow x \in U.$$

Theorem 5.16. Let $(M, \mathcal{O}_M)$ be a Hausdorff topological space. Then $(M, \mathcal{O}_M)$ is
paracompact if, and only if, every open cover admits a partition of unity subordinate to that
cover.

Example 5.17. Let $\mathbb{R}$ be equipped with the standard topology. Then $\mathbb{R}$ is paracompact by
Stone’s theorem. Hence, every open cover of $\mathbb{R}$ admits a partition of unity subordinate to
that cover. As a simple example, consider $\mathcal{F} = \{f, g\}$, where:

$$f(x) = \begin{cases} 0 & \text{if } x \leq 0 \\ x^2 & \text{if } 0 \leq x \leq 1 \\ 1 & \text{if } x \geq 1 \end{cases}
\qquad \text{and} \qquad
g(x) = \begin{cases} 1 & \text{if } x \leq 0 \\ 1 - x^2 & \text{if } 0 \leq x \leq 1 \\ 0 & \text{if } x \geq 1 \end{cases}$$

Then $\mathcal{F}$ is a partition of unity of $\mathbb{R}$. Indeed, $f, g\colon \mathbb{R} \to [0, 1]$ are both continuous, condition
i) is satisfied since $\mathcal{F}$ itself is finite, and we have $\forall\, x \in \mathbb{R} : f(x) + g(x) = 1$.
Let $C := \{(-\infty, 1), (0, \infty)\}$. Then $C$ is an open cover of $\mathbb{R}$ and since:

$$f(x) \neq 0 \Rightarrow x \in (0, \infty) \quad \text{and} \quad g(x) \neq 0 \Rightarrow x \in (-\infty, 1),$$

the partition of unity $\mathcal{F}$ is subordinate to the open cover $C$.
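
The claims of this example can be spot-checked numerically. The sketch below assumes the middle branches $x^2$ and $1 - x^2$ on $[0, 1]$ and samples a grid on $[-3, 3]$; the grid and the tolerance are arbitrary choices made here, and a finite sample is of course an illustration rather than a proof.

```python
def f(x):
    """f = 0 for x <= 0, x^2 on [0, 1], 1 for x >= 1."""
    if x <= 0:
        return 0.0
    return x * x if x <= 1 else 1.0

def g(x):
    """g = 1 for x <= 0, 1 - x^2 on [0, 1], 0 for x >= 1."""
    if x <= 0:
        return 1.0
    return 1.0 - x * x if x <= 1 else 0.0

samples = [k / 100 for k in range(-300, 301)]  # grid on [-3, 3]

# Condition ii): the functions sum to 1 at every sample point.
sums_to_one = all(abs(f(x) + g(x) - 1.0) < 1e-12 for x in samples)

# Subordination: f is non-zero only inside (0, oo), g only inside (-oo, 1).
f_subordinate = all(x > 0 for x in samples if f(x) != 0)
g_subordinate = all(x < 1 for x in samples if g(x) != 0)
```

All three flags evaluate to `True` on this grid, in line with the argument given in the example.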


5.3 Connectedness and path-connectedness 


Definition. A topological space $(M, \mathcal{O})$ is said to be connected unless there exist two
non-empty, non-intersecting open sets $A$ and $B$ such that $M = A \cup B$.

Example 5.18. Consider $(\mathbb{R} \setminus \{0\}, \mathcal{O}_{\text{std}}|_{\mathbb{R} \setminus \{0\}})$, i.e. $\mathbb{R} \setminus \{0\}$ equipped with the subset topology
inherited from $\mathbb{R}$. This topological space is not connected since $(-\infty, 0)$ and $(0, \infty)$ are
open, non-empty, non-intersecting sets such that $\mathbb{R} \setminus \{0\} = (-\infty, 0) \cup (0, \infty)$.

Theorem 5.19. The interval $[0, 1] \subseteq \mathbb{R}$ equipped with the subset topology is connected.

Theorem 5.20. A topological space $(M, \mathcal{O})$ is connected if, and only if, the only subsets
that are both open and closed are $\emptyset$ and $M$.

Proof. ($\Rightarrow$) Suppose, for the sake of contradiction, that there exists $U \subseteq M$ such that $U$
is both open and closed and $U \notin \{\emptyset, M\}$. Consider the sets $U$ and $M \setminus U$. Clearly,
we have $U \cap (M \setminus U) = \emptyset$. Moreover, $M \setminus U$ is open since $U$ is closed. Therefore, $U$
and $M \setminus U$ are two open, non-empty, non-intersecting sets such that $M = U \cup (M \setminus U)$,
contradicting the connectedness of $(M, \mathcal{O})$.

($\Leftarrow$) Suppose that $(M, \mathcal{O})$ is not connected. Then there exist open, non-empty,
non-intersecting subsets $A, B \subseteq M$ such that $M = A \cup B$. Clearly, $A \neq M$, otherwise we
would have $B = \emptyset$. Moreover, since $B$ is open, $A = M \setminus B$ is closed. Hence, $A$ is a
set which is both open and closed and $A \notin \{\emptyset, M\}$.


Definition. A topological space (M,Q) is said to be path-connected if for every pair of 
points p,q € M there exists a continuous curve y: [0,1] — M such that 7(0) = p and 


y(1) =4¢. 


Example 5.21. The space $(\mathbb{R}^d,\mathcal{O}_{\mathrm{std}})$ is path-connected. Indeed, let $p,q\in\mathbb{R}^d$ and let:

$$\gamma(\lambda) := p+\lambda(q-p).$$

Then $\gamma$ is continuous and satisfies $\gamma(0)=p$ and $\gamma(1)=q$.


Example 5.22. Let $S := \{(x,\sin(\tfrac{1}{x})) \mid x\in(0,1]\}\cup\{(0,0)\}$ be equipped with the subset
topology inherited from $\mathbb{R}^2$. The space $(S,\mathcal{O}_{\mathrm{std}}|_S)$ is connected but not path-connected.


Theorem 5.23. If a topological space is path-connected, then it is also connected.


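Proof. Let $(M,\mathcal{O})$ be path-connected but not connected. Then there exist open, non-empty,
non-intersecting subsets $A,B\subseteq M$ such that $M = A\cup B$. Let $p\in A$ and $q\in B$. Since
$(M,\mathcal{O})$ is path-connected, there exists a continuous curve $\gamma\colon[0,1]\to M$ such that $\gamma(0)=p$
and $\gamma(1)=q$. Then:

$$[0,1] = \mathrm{preim}_\gamma(M) = \mathrm{preim}_\gamma(A\cup B) = \mathrm{preim}_\gamma(A)\cup\mathrm{preim}_\gamma(B).$$

The sets $\mathrm{preim}_\gamma(A)$ and $\mathrm{preim}_\gamma(B)$ are both open (since $\gamma$ is continuous), non-empty
(they contain $0$ and $1$ respectively) and non-intersecting, contradicting the fact that $[0,1]$ is connected.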


5.4 Homotopic curves and the fundamental group 


Definition. Let $(M,\mathcal{O})$ be a topological space. Two curves $\gamma,\delta\colon[0,1]\to M$ such that:

$$\gamma(0)=\delta(0) \quad\text{and}\quad \gamma(1)=\delta(1)$$

are said to be homotopic if there exists a continuous map $h\colon[0,1]\times[0,1]\to M$ such that
for all $\lambda\in[0,1]$:

$$h(0,\lambda)=\gamma(\lambda) \quad\text{and}\quad h(1,\lambda)=\delta(\lambda).$$

Pictorially, two curves are homotopic if they can be continuously deformed into one another.


Proposition 5.24. Let $\gamma\sim\delta :\Leftrightarrow$ “$\gamma$ and $\delta$ are homotopic”. Then, $\sim$ is an equivalence
relation.


Definition. Let $(M,\mathcal{O})$ be a topological space. Then, for every $p\in M$, we define the space
of loops at $p$ by:

$$\mathcal{L}_p := \{\gamma\colon[0,1]\to M \mid \gamma \text{ is continuous and } \gamma(0)=\gamma(1)=p\}.$$


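Definition. Let $\mathcal{L}_p$ be the space of loops at $p\in M$. We define the concatenation operation
$\star\colon\mathcal{L}_p\times\mathcal{L}_p\to\mathcal{L}_p$ by:

$$(\gamma\star\delta)(\lambda) := \begin{cases}\gamma(2\lambda) & \text{if } 0\leq\lambda\leq\tfrac{1}{2} \\ \delta(2\lambda-1) & \text{if } \tfrac{1}{2}\leq\lambda\leq 1\end{cases}$$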
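
The concatenation formula translates directly into code. The following is a minimal sketch, assuming loops are represented as Python functions on $[0,1]$; the two sample loops in the plane (based at the hypothetical point $p=(1,0)$) are illustrative and not taken from the text.

```python
import math


def concatenate(gamma, delta):
    """Return gamma * delta: run gamma at double speed, then delta at double speed."""
    def loop(lam):
        if lam <= 0.5:
            return gamma(2 * lam)
        return delta(2 * lam - 1)
    return loop


# Hypothetical loops at p = (1, 0) in the plane, for illustration only.
def circle(lam):
    return (math.cos(2 * math.pi * lam), math.sin(2 * math.pi * lam))


def constant(lam):
    return (1.0, 0.0)


both = concatenate(circle, constant)
print(both(0.0))   # start of the circle, i.e. the base point
print(both(0.25))  # halfway around the circle, approximately (-1, 0)
print(both(0.75))  # already on the constant loop
```

Both branches agree at $\lambda=\tfrac{1}{2}$ because $\gamma(1)=p=\delta(0)$, which is why the concatenated map is again continuous.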


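Definition. Let $(M,\mathcal{O})$ be a topological space. The fundamental group $\pi_1(p)$ of $(M,\mathcal{O})$ at
$p\in M$ is the set:

$$\pi_1(p) := \mathcal{L}_p/\!\sim\ = \{[\gamma] \mid \gamma\in\mathcal{L}_p\},$$

where $\sim$ is the homotopy equivalence relation, together with the map

$$\bullet\colon\pi_1(p)\times\pi_1(p)\to\pi_1(p), \qquad ([\gamma],[\delta])\mapsto[\gamma]\bullet[\delta] := [\gamma\star\delta].$$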


Remark 5.25. Recall that a group is a pair $(G,\bullet)$ where $G$ is a set and $\bullet\colon G\times G\to G$ is a
map (also called binary operation) such that:

i) $\forall a,b,c\in G: (a\bullet b)\bullet c = a\bullet(b\bullet c)$;

ii) $\exists e\in G: \forall g\in G: g\bullet e = e\bullet g = g$;

iii) $\forall g\in G: \exists g^{-1}\in G: g\bullet g^{-1} = g^{-1}\bullet g = e$.

A group is called abelian (or commutative) if, in addition, $a\bullet b = b\bullet a$ for all $a,b\in G$.
A group isomorphism between two groups $(G,\bullet)$ and $(H,\circ)$ is a bijection $\phi\colon G\to H$
such that:

$$\forall a,b\in G: \phi(a\bullet b) = \phi(a)\circ\phi(b).$$

If there exists a group isomorphism between $(G,\bullet)$ and $(H,\circ)$, we say that $G$ and $H$ are
(group theoretic) isomorphic and we write $G\cong_{\mathrm{grp}}H$.


The operation $\bullet$ is associative (since concatenation is associative up to homotopy); the neutral
element of the fundamental group $(\pi_1(p),\bullet)$ is (the equivalence class of) the constant curve $\gamma_e$
defined by:

$$\gamma_e\colon[0,1]\to M, \qquad \lambda\mapsto\gamma_e(\lambda) := p.$$

Finally, for each $[\gamma]\in\pi_1(p)$, the inverse under $\bullet$ is the element $[-\gamma]$, where $-\gamma$ is defined
by:

$$-\gamma\colon[0,1]\to M, \qquad \lambda\mapsto\gamma(1-\lambda).$$

All the previously discussed topological properties are “boolean-valued”, i.e. a topolog- 
ical space is either Hausdorff or not Hausdorff, either connected or not connected, and so 
on. The fundamental group is a “group-valued” property, i.e. the value of the property is 
not “either yes or no”, but a group. 

A property of a topological space is called an invariant if any two homeomorphic spaces
share the property. A classification of topological spaces would be a list of topological
invariants such that any two spaces which share these invariants are homeomorphic. As of
now, no such list is known.


Example 5.26. The 2-sphere is defined as the set:

$$S^2 := \{(x,y,z)\in\mathbb{R}^3 \mid x^2+y^2+z^2=1\}$$

equipped with the subset topology inherited from $\mathbb{R}^3$.

The sphere has the property that all the loops at any point are homotopic, hence the
fundamental group (at every point) of the sphere is the trivial group:

$$\forall p\in S^2: \pi_1(p) = 1 := \{[\gamma_e]\}.$$


Example 5.27. The cylinder is defined as $C := \mathbb{R}\times S^1$ equipped with the product topology.
A loop in $C$ can either go around the cylinder (i.e. around its central axis) or not. If
it does not, then it can be continuously deformed to a point (the identity loop). If it does,
then it cannot be deformed to the identity loop (intuitively because the cylinder is infinitely
long) and hence it is a homotopically different loop. The number of times a loop winds
around the cylinder is called the winding number. Loops with different winding numbers
are not homotopic.
Moreover, loops with different orientations are also not homotopic and hence we have:

$$\forall p\in C: (\pi_1(p),\bullet)\cong_{\mathrm{grp}}(\mathbb{Z},+).$$
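
The winding number can be estimated numerically, which also illustrates why concatenation of loops corresponds to addition in $\mathbb{Z}$. The following is only a sketch under stated assumptions: the $S^1$ factor is modelled as the unit circle in the plane, the $\mathbb{R}$ coordinate is ignored since it does not affect the winding, and the helper names `winding_number`, `wind` and `concatenate` are hypothetical.

```python
import math


def winding_number(loop, samples=1000):
    """Estimate how often a sampled loop in the punctured plane winds around the origin."""
    total = 0.0
    x0, y0 = loop(0.0)
    prev = math.atan2(y0, x0)
    for k in range(1, samples + 1):
        x, y = loop(k / samples)
        angle = math.atan2(y, x)
        # Bring each increment into [-pi, pi) so jumps across the branch cut cancel.
        step = (angle - prev + math.pi) % (2 * math.pi) - math.pi
        total += step
        prev = angle
    return round(total / (2 * math.pi))


def wind(n):
    """Loop on the unit circle that goes around n times (negative n reverses orientation)."""
    return lambda lam: (math.cos(2 * math.pi * n * lam), math.sin(2 * math.pi * n * lam))


def concatenate(gamma, delta):
    """Concatenation of loops as defined in the text."""
    return lambda lam: gamma(2 * lam) if lam <= 0.5 else delta(2 * lam - 1)


print(winding_number(wind(1)))                        # 1
print(winding_number(wind(-2)))                       # -2
print(winding_number(concatenate(wind(1), wind(2))))  # 3
```

The estimate is only reliable when consecutive sample points are less than half a turn apart, so loops that wind very often need a larger `samples` value.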


Example 5.28. The 2-torus is defined as the set $T^2 := S^1\times S^1$ equipped with the product
topology.

A loop in $T^2$ can intuitively wind around the cylinder-like part of the torus as well as around
the hole of the torus. That is, there are two independent winding numbers and hence:

$$\forall p\in T^2: \pi_1(p)\cong_{\mathrm{grp}}\mathbb{Z}\times\mathbb{Z},$$

where $\mathbb{Z}\times\mathbb{Z}$ is understood as a group under pairwise addition.


6 Topological manifolds and bundles


6.1 Topological manifolds 


Definition. A paracompact, Hausdorff, topological space $(M,\mathcal{O})$ is called a d-dimensional
(topological) manifold if for every point $p\in M$ there exist a neighbourhood $U(p)$ and a
homeomorphism $x\colon U(p)\to x(U(p))\subseteq\mathbb{R}^d$. We also write $\dim M = d$.


Intuitively, a d-dimensional manifold is a topological space which locally (i.e. around
each point) looks like $\mathbb{R}^d$. Note that, strictly speaking, what we have just defined are real
topological manifolds. We could define complex topological manifolds as well, simply by
requiring that the map $x$ be a homeomorphism onto an open subset of $\mathbb{C}^d$.


Proposition 6.1. Let $M$ be a d-dimensional manifold and let $U,V\subseteq M$ be open, with
$U\cap V\neq\emptyset$. If $x$ and $y$ are two homeomorphisms

$$x\colon U\to x(U)\subseteq\mathbb{R}^d \quad\text{and}\quad y\colon V\to y(V)\subseteq\mathbb{R}^{d'},$$

then $d=d'$.

This ensures that the concept of dimension is indeed well-defined, i.e. it is the same at
every point, at least on each connected component of the manifold.


Example 6.2. Trivially, $\mathbb{R}^d$ is a d-dimensional manifold for any $d\geq 1$. The space $S^1$ is a
1-dimensional manifold while the spaces $S^2$, $C$ and $T^2$ are 2-dimensional manifolds.


Definition. Let $(M,\mathcal{O})$ be a topological manifold and let $N\subseteq M$. Then $(N,\mathcal{O}|_N)$ is called
a submanifold of $(M,\mathcal{O})$ if it is a manifold in its own right.


Example 6.3. The space $S^1$ is a submanifold of $\mathbb{R}^2$ while the spaces $S^2$, $C$ and $T^2$ are
submanifolds of $\mathbb{R}^3$.


Definition. Let $(M,\mathcal{O}_M)$ and $(N,\mathcal{O}_N)$ be topological manifolds of dimension $m$ and $n$,
respectively. Then, $(M\times N,\mathcal{O}_{M\times N})$ is a topological manifold of dimension $m+n$ called
the product manifold.


Example 6.4. We have $T^2 = S^1\times S^1$ not just as topological spaces, but as topological
manifolds as well. This is a special case of the n-torus:

$$T^n := \underbrace{S^1\times S^1\times\cdots\times S^1}_{n\ \text{times}},$$

which is an n-dimensional manifold.


Example 6.5. The cylinder $C = S^1\times\mathbb{R}$ is a 2-dimensional manifold.


6.2 Bundles


Products are very useful. Very often in physics one intuitively thinks of the product of two
manifolds as attaching a copy of the second manifold to each point of the first. However,
not all interesting manifolds can be understood as products of manifolds. A classic example
of this is the Möbius strip.²


It looks locally like the finite cylinder $S^1\times[0,1]$, which we can picture as the circle $S^1$
with the finite interval $[0,1]$ attached to each of its points in a
“smooth” way. The Möbius strip has a “twist”, which makes it globally different from the
cylinder.


Definition. A bundle (of topological manifolds) is a triple $(E,\pi,M)$ where $E$ and $M$ are
topological manifolds called the total space and the base space respectively, and $\pi$ is a
continuous, surjective map $\pi\colon E\to M$ called the projection map.
We will often denote the bundle $(E,\pi,M)$ by $E\xrightarrow{\pi}M$.


Definition. Let $E\xrightarrow{\pi}M$ be a bundle and let $p\in M$. Then, $F_p := \mathrm{preim}_\pi(\{p\})$ is called
the fibre at the point $p$.


Intuitively, the fibre at the point $p\in M$ is a set of points in $E$ (which one can picture as
a line) attached to the point $p$. The projection map sends all the points in the fibre $F_p$ to
the point $p$.


² The TikZ code for the Möbius strip was written by Jake on TeX.SE.


Example 6.6. A trivial example of a bundle is the product bundle. Let $M$ and $N$ be
manifolds. Then, the triple $(M\times N,\pi,M)$, where:

$$\pi\colon M\times N\to M, \qquad (p,q)\mapsto p$$

is a bundle since (one can easily check) $\pi$ is a continuous and surjective map. Similarly,
$(M\times N,\pi,N)$ with the appropriate $\pi$, is also a bundle.


Example 6.7. In a bundle, different points of the base manifold may have (topologically)
different fibres. For example, consider the bundle $E\xrightarrow{\pi}\mathbb{R}$ where:

$$F_p := \mathrm{preim}_\pi(\{p\}) \cong_{\mathrm{top}} \begin{cases} S^1 & \text{if } p<0 \\ \{p\} & \text{if } p=0 \\ [0,1] & \text{if } p>0 \end{cases}$$


Definition. Let $E\xrightarrow{\pi}M$ be a bundle and let $F$ be a manifold. Then, $E\xrightarrow{\pi}M$ is called a
fibre bundle, with (typical) fibre $F$, if:

$$\forall p\in M: \mathrm{preim}_\pi(\{p\})\cong_{\mathrm{top}}F.$$

A fibre bundle is often represented diagrammatically as:

$$F\longrightarrow E\xrightarrow{\ \pi\ }M$$


Example 6.8. The bundle $M\times N\xrightarrow{\pi}M$ is a fibre bundle with fibre $F := N$.


Example 6.9. The Möbius strip is a fibre bundle $E\xrightarrow{\pi}S^1$, with fibre $F := [0,1]$, where
$E\neq S^1\times[0,1]$, i.e. the Möbius strip is not a product bundle.


Example 6.10. A $\mathbb{C}$-line bundle over $M$ is the fibre bundle $(E,\pi,M)$ with fibre $\mathbb{C}$. Note
that the product bundle $(M\times\mathbb{C},\pi,M)$ is a $\mathbb{C}$-line bundle over $M$, but a $\mathbb{C}$-line bundle over
$M$ need not be a product bundle.


Definition. Let $E\xrightarrow{\pi}M$ be a bundle. A map $\sigma\colon M\to E$ is called a (cross-)section of the
bundle if $\pi\circ\sigma = \mathrm{id}_M$.


Intuitively, a section is a map $\sigma$ which sends each point $p\in M$ to some point $\sigma(p)$ in
its fibre $F_p$, so that the projection map $\pi$ takes $\sigma(p)\in F_p\subseteq E$ back to the point $p\in M$.


Example 6.11. Let $(M\times F,\pi,M)$ be a product bundle. Then, a section of this bundle is a
map:

$$\sigma\colon M\to M\times F, \qquad p\mapsto(p,s(p))$$

where $s\colon M\to F$ is any map.


Definition. A sub-bundle of a bundle $(E,\pi,M)$ is a triple $(E',\pi',M')$ where $E'\subseteq E$ and
$M'\subseteq M$ are submanifolds and $\pi' := \pi|_{E'}$.


Definition. Let $(E,\pi,M)$ be a bundle and let $N\subseteq M$ be a submanifold. The restricted
bundle (to $N$) is the triple $(\mathrm{preim}_\pi(N),\pi',N)$ where:

$$\pi' := \pi|_{\mathrm{preim}_\pi(N)}.$$


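Definition. Let $E\xrightarrow{\pi}M$ and $E'\xrightarrow{\pi'}M'$ be bundles and let $u\colon E\to E'$ and $v\colon M\to M'$
be maps. Then $(u,v)$ is called a bundle morphism if the following diagram commutes:

$$\begin{array}{ccc} E & \xrightarrow{\ u\ } & E' \\ \downarrow{\scriptstyle\pi} & & \downarrow{\scriptstyle\pi'} \\ M & \xrightarrow{\ v\ } & M' \end{array}$$

i.e. if $\pi'\circ u = v\circ\pi$.
If $(u,v)$ and $(u,v')$ are both bundle morphisms, then $v\circ\pi = v'\circ\pi$ and hence $v=v'$, since
$\pi$ is surjective. That is, given $u$, if there exists $v$ such that $(u,v)$ is a bundle morphism,
then $v$ is unique.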


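Definition. Two bundles $E\xrightarrow{\pi}M$ and $E'\xrightarrow{\pi'}M'$ are said to be isomorphic (as bundles)
if there exist bundle morphisms $(u,v)$ and $(u^{-1},v^{-1})$ satisfying:

$$\begin{array}{ccc} E & \underset{u^{-1}}{\overset{u}{\rightleftarrows}} & E' \\ \downarrow{\scriptstyle\pi} & & \downarrow{\scriptstyle\pi'} \\ M & \underset{v^{-1}}{\overset{v}{\rightleftarrows}} & M' \end{array}$$

Such a $(u,v)$ is called a bundle isomorphism and we write $E\xrightarrow{\pi}M\cong_{\mathrm{bdl}}E'\xrightarrow{\pi'}M'$.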


Bundle isomorphisms are the structure-preserving maps for bundles. 


Definition. A bundle $E\xrightarrow{\pi}M$ is said to be locally isomorphic (as a bundle) to a bundle
$E'\xrightarrow{\pi'}M'$ if for all $p\in M$ there exists a neighbourhood $U(p)$ such that the restricted
bundle:

$$\mathrm{preim}_\pi(U(p))\xrightarrow{\ \pi|_{\mathrm{preim}_\pi(U(p))}\ }U(p)$$

is isomorphic to the bundle $E'\xrightarrow{\pi'}M'$.


Definition. A bundle $E\xrightarrow{\pi}M$ is said to be:

i) trivial if it is isomorphic to a product bundle;

ii) locally trivial if it is locally isomorphic to a product bundle.


Example 6.12. The cylinder $C$ is trivial as a bundle, and hence also locally trivial.

Example 6.13. The Möbius strip is not trivial but it is locally trivial.

From now on, we will mostly consider locally trivial bundles.


Remark 6.14. In quantum mechanics, what is usually called a “wave function” is not a
function at all, but rather a section of a $\mathbb{C}$-line bundle over physical space. However, if we
assume that the $\mathbb{C}$-line bundle under consideration is locally trivial, then each section of
the bundle can be represented (locally) by a map from the base space to the fibre $\mathbb{C}$
and hence it is appropriate to use the term “wave function”.


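Definition. Let $E\xrightarrow{\pi}M$ be a bundle and let $f\colon M'\to M$ be a map from some manifold
$M'$. The pull-back bundle of $E\xrightarrow{\pi}M$ induced by $f$ is defined as $E'\xrightarrow{\pi'}M'$, where:

$$E' := \{(m',e)\in M'\times E \mid f(m')=\pi(e)\}$$

and $\pi'(m',e) := m'$.


If $E'\xrightarrow{\pi'}M'$ is the pull-back bundle of $E\xrightarrow{\pi}M$ induced by $f$, then one can easily
construct a bundle morphism by defining:

$$u\colon E'\to E, \qquad (m',e)\mapsto e.$$

This corresponds to the diagram:

$$\begin{array}{ccc} E' & \xrightarrow{\ u\ } & E \\ \downarrow{\scriptstyle\pi'} & & \downarrow{\scriptstyle\pi} \\ M' & \xrightarrow{\ f\ } & M \end{array}$$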


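Remark 6.15. Sections on a bundle pull back to the pull-back bundle. Indeed, let $E'\xrightarrow{\pi'}M'$
be the pull-back bundle of $E\xrightarrow{\pi}M$ induced by $f$.

$$\begin{array}{ccc} E' & \xrightarrow{\ u\ } & E \\ {\scriptstyle\sigma'}\uparrow\downarrow{\scriptstyle\pi'} & & {\scriptstyle\pi}\downarrow\uparrow{\scriptstyle\sigma} \\ M' & \xrightarrow{\ f\ } & M \end{array}$$

If $\sigma$ is a section of $E\xrightarrow{\pi}M$, then $\sigma\circ f$ determines a map from $M'$ to $E$ which sends each
$m'\in M'$ to $\sigma(f(m'))\in E$. However, since $\sigma$ is a section, we have:

$$\pi(\sigma(f(m'))) = (\pi\circ\sigma\circ f)(m') = (\mathrm{id}_M\circ f)(m') = f(m')$$

and hence $(m',(\sigma\circ f)(m'))\in E'$ by definition of $E'$. Moreover:

$$\pi'(m',(\sigma\circ f)(m')) = m'$$

and hence the map:

$$\sigma'\colon M'\to E', \qquad m'\mapsto(m',(\sigma\circ f)(m'))$$

satisfies $\pi'\circ\sigma' = \mathrm{id}_{M'}$ and it is thus a section on the pull-back bundle $E'\xrightarrow{\pi'}M'$.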
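
The construction can be checked on a finite toy model. The following is a sketch only: it works with finite sets, so continuity plays no role, and the base $M=\{0,1\}$, the fibre $\{\text{a},\text{b}\}$, the map $f$ and the section $\sigma$ are all hypothetical data chosen for illustration.

```python
def pull_back(E, pi, M_prime, f):
    """Total space E' = {(m', e) in M' x E | f(m') == pi(e)} of the pull-back bundle."""
    return {(m, e) for m in M_prime for e in E if f(m) == pi(e)}


def pull_back_section(sigma, f):
    """Pull the section sigma of E -> M back to the section m' -> (m', sigma(f(m')))."""
    return lambda m: (m, sigma(f(m)))


# Hypothetical finite toy data: base M = {0, 1} with fibre {"a", "b"} over each point.
M = {0, 1}
E = {(p, s) for p in M for s in ("a", "b")}
pi = lambda e: e[0]            # projection E -> M
M_prime = {10, 11, 12}
f = lambda m: m % 2            # a map M' -> M
sigma = lambda p: (p, "a")     # a section of E -> M

E_prime = pull_back(E, pi, M_prime, f)
pi_prime = lambda pair: pair[0]
sigma_prime = pull_back_section(sigma, f)

for m in M_prime:
    assert sigma_prime(m) in E_prime      # sigma' really lands in E'
    assert pi_prime(sigma_prime(m)) == m  # pi' o sigma' = id on M'
```

Each point of $M'$ receives a copy of the fibre over its image under $f$, which is why `E_prime` has two elements per point of `M_prime`.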


6.3 Viewing manifolds from atlases


Definition. Let $(M,\mathcal{O})$ be a d-dimensional manifold. Then, a pair $(U,x)$ where $U\in\mathcal{O}$
and $x\colon U\to x(U)\subseteq\mathbb{R}^d$ is a homeomorphism, is said to be a chart of the manifold.


The component functions (or maps) of $x\colon U\to x(U)\subseteq\mathbb{R}^d$ are the maps:

$$x^i\colon U\to\mathbb{R}, \qquad p\mapsto\mathrm{proj}_i(x(p))$$

for $1\leq i\leq d$, where $\mathrm{proj}_i(x(p))$ is the i-th component of $x(p)\in\mathbb{R}^d$. The $x^i(p)$ are called
the co-ordinates of the point $p\in U$ with respect to the chart $(U,x)$.


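Definition. An atlas of a manifold $M$ is a collection $\mathcal{A} := \{(U_\alpha,x_\alpha) \mid \alpha\in A\}$ of charts
such that:

$$\bigcup_{\alpha\in A}U_\alpha = M.$$


Definition. Two charts $(U,x)$ and $(V,y)$ are said to be $C^0$-compatible if either $U\cap V=\emptyset$
or the map:

$$y\circ x^{-1}\colon x(U\cap V)\to y(U\cap V)$$

is continuous.


Note that $y\circ x^{-1}$ is a map from a subset of $\mathbb{R}^d$ to a subset of $\mathbb{R}^d$:

$$x(U\cap V)\subseteq\mathbb{R}^d \ \xleftarrow{\ x\ }\ U\cap V\subseteq M \ \xrightarrow{\ y\ }\ y(U\cap V)\subseteq\mathbb{R}^d$$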


Since the maps $x$ and $y$ are homeomorphisms, the composition map $y\circ x^{-1}$ is also a
homeomorphism and hence continuous. Therefore, any two charts on a topological manifold
are $C^0$-compatible. This definition may thus seem redundant since it applies to every pair of
charts. However, it is just a “warm up” since we will later refine this definition and define
the differentiability of maps on a manifold in terms of $C^k$-compatibility of charts.


Remark 6.16. The map $y\circ x^{-1}$ (and its inverse $x\circ y^{-1}$) is called the co-ordinate change
map or chart transition map.


Definition. A $C^0$-atlas of a manifold is an atlas of pairwise $C^0$-compatible charts.
Note that any atlas is also a $C^0$-atlas.


Definition. A $C^0$-atlas $\mathcal{A}$ is said to be a maximal atlas if for every $(U,x)\in\mathcal{A}$, we have
$(V,y)\in\mathcal{A}$ for all charts $(V,y)$ that are $C^0$-compatible with $(U,x)$.


Example 6.17. Not every $C^0$-atlas is a maximal atlas. Indeed, consider $(\mathbb{R},\mathcal{O}_{\mathrm{std}})$ and the
atlas $\mathcal{A} := \{(\mathbb{R},\mathrm{id}_{\mathbb{R}})\}$. Then $\mathcal{A}$ is not maximal since $((0,1),\mathrm{id}_{\mathbb{R}})$ is a chart which is
$C^0$-compatible with $(\mathbb{R},\mathrm{id}_{\mathbb{R}})$ but $((0,1),\mathrm{id}_{\mathbb{R}})\notin\mathcal{A}$.

We can now look at “objects on” topological manifolds from two points of view. For
instance, consider a curve on a d-dimensional manifold $M$, i.e. a map $\gamma\colon\mathbb{R}\to M$. We now
ask whether this curve is continuous, as it should be if it models the trajectory of a particle
on the “physical space” $M$.

A first answer is that $\gamma\colon\mathbb{R}\to M$ is continuous if it is continuous as a map between the
topological spaces $\mathbb{R}$ and $M$.

However, the answer that may be more familiar to you from undergraduate physics is
the following. We consider only a portion (open subset $U$) of the physical space $M$ and,
instead of studying the map $\gamma\colon\mathrm{preim}_\gamma(U)\to U$ directly, we study the map:

$$x\circ\gamma\colon\mathrm{preim}_\gamma(U)\to x(U)\subseteq\mathbb{R}^d,$$

where $(U,x)$ is a chart of $M$. More likely, you would be checking the continuity of
the co-ordinate maps $x^i\circ\gamma$, which would then imply the continuity of the “real” curve
$\gamma\colon\mathrm{preim}_\gamma(U)\to U$ (real, as opposed to its co-ordinate representation).

$$\begin{array}{ccc} & & y(U)\subseteq\mathbb{R}^d \\ & \overset{y\circ\gamma}{\nearrow} & \uparrow{\scriptstyle y} \\ \mathrm{preim}_\gamma(U)\subseteq\mathbb{R} & \xrightarrow{\ \gamma\ } & U\subseteq M \\ & \underset{x\circ\gamma}{\searrow} & \downarrow{\scriptstyle x} \\ & & x(U)\subseteq\mathbb{R}^d \end{array}$$

At some point you may wish to use a different “co-ordinate system” to answer a different
question. In this case, you would choose a different chart $(U,y)$ and then study the map $y\circ\gamma$
or its co-ordinate maps. Notice however that some results (e.g. the continuity of $\gamma$) obtained
in the previous chart $(U,x)$ can be immediately “transported” to the new chart $(U,y)$ via the
chart transition map $y\circ x^{-1}$. Moreover, the map $y\circ x^{-1}$ allows us to, intuitively speaking,
forget about the inner structure (i.e. $U$ and the maps $\gamma$, $x$ and $y$) which, in a sense, is the
real world, and only consider $\mathrm{preim}_\gamma(U)\subseteq\mathbb{R}$ and $x(U),y(U)\subseteq\mathbb{R}^d$ together with the maps
between them, which is our representation of the real world.


7 Differentiable structures: definition and classification 


7.1 Adding structure by refining the (maximal) atlas 


We saw previously that for a topological manifold $(M,\mathcal{O})$, the concept of a $C^0$-atlas was
fully redundant since every atlas is also a $C^0$-atlas. We will now generalise the notion of a
$C^0$-atlas, or more precisely, the notion of $C^0$-compatibility of charts, to something which is
non-trivial and non-redundant.


Definition. An atlas $\mathcal{A}$ for a topological manifold is called a ★-atlas if any two charts
$(U,x),(V,y)\in\mathcal{A}$ are ★-compatible.


In other words, either $U\cap V=\emptyset$ or, if $U\cap V\neq\emptyset$, then the transition map $y\circ x^{-1}$
from $x(U\cap V)$ to $y(U\cap V)$ must be ★.

$$x(U\cap V)\subseteq\mathbb{R}^{\dim M} \ \xleftarrow{\ x\ }\ U\cap V\subseteq M \ \xrightarrow{\ y\ }\ y(U\cap V)\subseteq\mathbb{R}^{\dim M}$$


Before you think Dr Schuller finally went nuts, the symbol ★ is being used as a placeholder
for any of the following:


• ★ $= C^0$: this just reduces to the previous definition;


• ★ $= C^k$: the transition maps are k-times continuously differentiable as maps between
open subsets of $\mathbb{R}^{\dim M}$;


• ★ $= C^\infty$: the transition maps are smooth (infinitely many times differentiable); equivalently,
the atlas is $C^k$ for all $k\geq 0$;


• ★ $= C^\omega$: the transition maps are (real) analytic, which is stronger than being smooth;


• ★ $=$ complex: if $\dim M$ is even, $M$ is a complex manifold if the transition maps
are continuous and satisfy the Cauchy-Riemann equations; its complex dimension is
$\tfrac{1}{2}\dim M$.


As an aside, if you haven’t met the Cauchy-Riemann equations yet, recall that since
$\mathbb{R}^2$ and $\mathbb{C}$ are isomorphic as sets, we can identify the function

$$f\colon\mathbb{R}^2\to\mathbb{R}^2, \qquad (x,y)\mapsto(u(x,y),v(x,y))$$

where $u,v\colon\mathbb{R}^2\to\mathbb{R}$, with the function

$$f\colon\mathbb{C}\to\mathbb{C}, \qquad x+iy\mapsto u(x,y)+iv(x,y).$$

If $u$ and $v$ are real differentiable at $(x_0,y_0)$, then $f=u+iv$ is complex differentiable at
$z_0 = x_0+iy_0$ if, and only if, $u$ and $v$ satisfy

$$\frac{\partial u}{\partial x}(x_0,y_0) = \frac{\partial v}{\partial y}(x_0,y_0) \quad\wedge\quad \frac{\partial u}{\partial y}(x_0,y_0) = -\frac{\partial v}{\partial x}(x_0,y_0),$$

which are known as the Cauchy-Riemann equations. Note that differentiability in the
complex plane is a much stronger condition than differentiability over the real numbers. If
you want to know more, you should take a course in complex analysis or function theory.
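
As a quick illustration of how the Cauchy-Riemann equations are used, here is a worked check for two sample functions that are not taken from the text: $f(z)=z^2$, which satisfies them everywhere, and $g(z)=\bar z$, which satisfies them nowhere.

```latex
% f(z) = z^2: both Cauchy-Riemann equations hold at every point.
\begin{align*}
f(x+iy) &= (x+iy)^2 = (x^2-y^2) + i\,(2xy),
  & u(x,y) &= x^2-y^2, \quad v(x,y) = 2xy, \\
\frac{\partial u}{\partial x} &= 2x = \frac{\partial v}{\partial y},
  & \frac{\partial u}{\partial y} &= -2y = -\frac{\partial v}{\partial x}.
\end{align*}
% g(z) = \bar{z}: the first equation fails at every point.
\begin{align*}
g(x+iy) &= x - iy,
  & u(x,y) &= x, \quad v(x,y) = -y, \\
\frac{\partial u}{\partial x} &= 1 \neq -1 = \frac{\partial v}{\partial y}.
\end{align*}
```

Both $f$ and $g$ are smooth as maps $\mathbb{R}^2\to\mathbb{R}^2$, yet only $f$ is complex differentiable, which illustrates how much stronger the complex condition is.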
We now go back to manifolds. 


Theorem 7.1 (Whitney). Any maximal $C^k$-atlas, with $k\geq 1$, contains a $C^\infty$-atlas. Moreover,
any two maximal $C^k$-atlases that contain the same $C^\infty$-atlas are identical.


An immediate implication is that if we can find a $C^1$-atlas for a manifold, then we
can also assume the existence of a $C^\infty$-atlas for that manifold. This is not the case for
topological manifolds in general: a space with a $C^0$-atlas may not admit any $C^1$-atlas. But
if we have at least a $C^1$-atlas, then we can obtain a $C^\infty$-atlas simply by removing charts,
keeping only the ones which are $C^\infty$-compatible.

Hence, for the purposes of this course, we will not distinguish between $C^k$ ($k\geq 1$) and
$C^\infty$-manifolds in the above sense.

We now give the explicit definition of a $C^k$-manifold.


Definition. A $C^k$-manifold is a triple $(M,\mathcal{O},\mathcal{A})$, where $(M,\mathcal{O})$ is a topological manifold
and $\mathcal{A}$ is a maximal $C^k$-atlas.


Remark 7.2. A given topological manifold can carry different incompatible atlases.


Note that while we only defined compatibility of charts, it should be clear what it
means for two atlases of the same type to be compatible.


Definition. Two ★-atlases $\mathcal{A}$, $\mathcal{B}$ are compatible if their union $\mathcal{A}\cup\mathcal{B}$ is again a ★-atlas,
and are incompatible otherwise.


Alternatively, we can define the compatibility of two atlases in terms of the compatibility
of any pair of charts, one from each atlas.


Example 7.3. Let $(M,\mathcal{O}) = (\mathbb{R},\mathcal{O}_{\mathrm{std}})$. Consider the two atlases $\mathcal{A} = \{(\mathbb{R},\mathrm{id}_{\mathbb{R}})\}$ and
$\mathcal{B} = \{(\mathbb{R},x)\}$, where $x\colon a\mapsto\sqrt[3]{a}$. Since they both contain a single chart, the compatibility
condition on the transition maps is easily seen to hold (in both cases, the only transition
map is $\mathrm{id}_{\mathbb{R}}$). Hence they are both $C^\infty$-atlases.

Consider now $\mathcal{A}\cup\mathcal{B}$. The transition map $\mathrm{id}_{\mathbb{R}}\circ x^{-1}$ is the map $a\mapsto a^3$, which is
smooth. However, the other transition map, $x\circ\mathrm{id}_{\mathbb{R}}^{-1}$, is the map $x$, which is not even
differentiable once (the first derivative at 0 does not exist). Consequently, $\mathcal{A}$ and $\mathcal{B}$ are
not even $C^1$-compatible.
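
For completeness, the failure of differentiability at 0 can be made explicit. The following short computation spells out the two transition maps of the example and the difference quotient of the cube root at the origin.

```latex
\begin{align*}
(\mathrm{id}_{\mathbb{R}} \circ x^{-1})(a) &= a^3,
  & \frac{d}{da}\,a^3 &= 3a^2 \quad \text{(and all higher derivatives exist)}, \\
(x \circ \mathrm{id}_{\mathbb{R}}^{-1})(a) &= \sqrt[3]{a},
  & \lim_{h \to 0} \frac{\sqrt[3]{h} - \sqrt[3]{0}}{h}
    &= \lim_{h \to 0} \frac{1}{h^{2/3}} = +\infty .
\end{align*}
```

Away from the origin the cube root is smooth, with derivative $\tfrac{1}{3}a^{-2/3}$; it is only the single point $a=0$ that obstructs compatibility.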

The previous example shows that we can equip the real line with (at least) two different
incompatible $C^\infty$-structures. This looks like a disaster as it implies that there is an arbitrary
choice to be made about which differentiable structure to use. Fortunately, the situation is
not as bad as it looks, as we will see in the next sections.
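

7.2 Differentiable manifolds


Definition. Let $\phi\colon M\to N$ be a map, where $(M,\mathcal{O}_M,\mathcal{A}_M)$ and $(N,\mathcal{O}_N,\mathcal{A}_N)$ are $C^k$-manifolds.
Then $\phi$ is said to be ($C^k$-)differentiable at $p\in M$ if for some charts $(U,x)\in\mathcal{A}_M$
with $p\in U$ and $(V,y)\in\mathcal{A}_N$ with $\phi(p)\in V$, the map $y\circ\phi\circ x^{-1}$ is k-times continuously
differentiable at $x(p)\in x(U)\subseteq\mathbb{R}^{\dim M}$ in the usual sense.

$$\begin{array}{ccc} U\subseteq M & \xrightarrow{\ \phi\ } & V\subseteq N \\ \downarrow{\scriptstyle x} & & \downarrow{\scriptstyle y} \\ x(U)\subseteq\mathbb{R}^{\dim M} & \xrightarrow{\ y\circ\phi\circ x^{-1}\ } & y(V)\subseteq\mathbb{R}^{\dim N} \end{array}$$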


The above diagram shows a typical theme with manifolds. We have a map $\phi\colon M\to N$
and we want to define some property of $\phi$ at $p\in M$ analogous to some property of maps
between subsets of $\mathbb{R}^d$. What we typically do is consider some charts $(U,x)$ and $(V,y)$ as
above and define the desired property of $\phi$ at $p\in U$ in terms of the corresponding property
of the downstairs map $y\circ\phi\circ x^{-1}$ at the point $x(p)\in\mathbb{R}^d$.

Notice that in the previous definition we only require that some charts from the two
atlases satisfy the stated property. So we should worry about whether this definition depends
on which charts we pick. In fact, this “lifting” of the notion of differentiability from
the chart representation of $\phi$ to the manifold level is well-defined.


Proposition 7.4. The definition of differentiability is well-defined.


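Proof. We want to show that if $y\circ\phi\circ x^{-1}$ is differentiable at $x(p)$ for some $(U,x)\in\mathcal{A}_M$
with $p\in U$ and $(V,y)\in\mathcal{A}_N$ with $\phi(p)\in V$, then $\tilde y\circ\phi\circ\tilde x^{-1}$ is differentiable at $\tilde x(p)$ for all
charts $(\tilde U,\tilde x)\in\mathcal{A}_M$ with $p\in\tilde U$ and $(\tilde V,\tilde y)\in\mathcal{A}_N$ with $\phi(p)\in\tilde V$.

$$\begin{array}{ccc}
\tilde x(U\cap\tilde U)\subseteq\mathbb{R}^{\dim M} & \xrightarrow{\ \tilde y\circ\phi\circ\tilde x^{-1}\ } & \tilde y(V\cap\tilde V)\subseteq\mathbb{R}^{\dim N} \\
\uparrow{\scriptstyle\tilde x} & & \uparrow{\scriptstyle\tilde y} \\
U\cap\tilde U\subseteq M & \xrightarrow{\ \phi\ } & V\cap\tilde V\subseteq N \\
\downarrow{\scriptstyle x} & & \downarrow{\scriptstyle y} \\
x(U\cap\tilde U)\subseteq\mathbb{R}^{\dim M} & \xrightarrow{\ y\circ\phi\circ x^{-1}\ } & y(V\cap\tilde V)\subseteq\mathbb{R}^{\dim N}
\end{array}$$

Consider the map $\tilde x\circ x^{-1}$, which goes from the bottom left to the top left of the diagram above.
Since the charts $(U,x)$ and $(\tilde U,\tilde x)$ belong to the same $C^k$-atlas $\mathcal{A}_M$, by definition the transition
map $\tilde x\circ x^{-1}$ is $C^k$-differentiable as a map between subsets of $\mathbb{R}^{\dim M}$, and similarly for
$\tilde y\circ y^{-1}$. We now notice that we can write:

$$\tilde y\circ\phi\circ\tilde x^{-1} = (\tilde y\circ y^{-1})\circ(y\circ\phi\circ x^{-1})\circ(\tilde x\circ x^{-1})^{-1}$$

and since the composition of $C^k$ maps is still $C^k$, we are done.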
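

This proof shows the significance of restricting to $C^k$-atlases. Such atlases only contain
charts for which the transition maps are $C^k$, which is what makes our definition of
differentiability of maps between manifolds well-defined.

The same definition and proof work for smooth ($C^\infty$) manifolds, in which case we talk
about smooth maps. As we said before, this is the case we will be most interested in.


Example 7.5. Consider the smooth manifolds $(\mathbb{R}^d,\mathcal{O}_{\mathrm{std}},\mathcal{A}_d)$ and $(\mathbb{R}^{d'},\mathcal{O}_{\mathrm{std}},\mathcal{A}_{d'})$, where $\mathcal{A}_d$
and $\mathcal{A}_{d'}$ are the maximal atlases containing the charts $(\mathbb{R}^d,\mathrm{id}_{\mathbb{R}^d})$ and $(\mathbb{R}^{d'},\mathrm{id}_{\mathbb{R}^{d'}})$ respectively,
and let $f\colon\mathbb{R}^d\to\mathbb{R}^{d'}$ be a map. The diagram defining the differentiability of $f$ with
respect to these charts is

$$\begin{array}{ccc} \mathbb{R}^d & \xrightarrow{\ f\ } & \mathbb{R}^{d'} \\ \downarrow{\scriptstyle\mathrm{id}_{\mathbb{R}^d}} & & \downarrow{\scriptstyle\mathrm{id}_{\mathbb{R}^{d'}}} \\ \mathbb{R}^d & \xrightarrow{\ \mathrm{id}_{\mathbb{R}^{d'}}\circ f\circ(\mathrm{id}_{\mathbb{R}^d})^{-1}\ } & \mathbb{R}^{d'} \end{array}$$

and, by definition, the map $f$ is smooth as a map between manifolds if, and only if, the
map $\mathrm{id}_{\mathbb{R}^{d'}}\circ f\circ(\mathrm{id}_{\mathbb{R}^d})^{-1} = f$ is smooth in the usual sense.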

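Example 7.6. Let $(M,\mathcal{O},\mathcal{A})$ be a d-dimensional smooth manifold and let $(U,x)\in\mathcal{A}$. Then
$x\colon U\to x(U)\subseteq\mathbb{R}^d$ is smooth. Indeed, we have

$$\begin{array}{ccc} U & \xrightarrow{\ x\ } & x(U) \\ \downarrow{\scriptstyle x} & & \downarrow{\scriptstyle\mathrm{id}_{x(U)}} \\ x(U)\subseteq\mathbb{R}^d & \xrightarrow{\ \mathrm{id}_{x(U)}\circ x\circ x^{-1}\ } & x(U)\subseteq\mathbb{R}^d \end{array}$$

Hence $x\colon U\to x(U)$ is smooth if, and only if, the map $\mathrm{id}_{x(U)}\circ x\circ x^{-1} = \mathrm{id}_{x(U)}$ is smooth
in the usual sense, which it certainly is.

The coordinate maps $x^i := \mathrm{proj}_i\circ x\colon U\to\mathbb{R}$ are also smooth. Indeed, consider the
diagram

$$\begin{array}{ccc} U & \xrightarrow{\ x^i\ } & \mathbb{R} \\ \downarrow{\scriptstyle x} & & \downarrow{\scriptstyle\mathrm{id}_{\mathbb{R}}} \\ x(U)\subseteq\mathbb{R}^d & \xrightarrow{\ \mathrm{id}_{\mathbb{R}}\circ x^i\circ x^{-1}\ } & \mathbb{R} \end{array}$$

Then, $x^i$ is smooth if, and only if, the map

$$\mathrm{id}_{\mathbb{R}}\circ x^i\circ x^{-1} = x^i\circ x^{-1} = \mathrm{proj}_i$$

is smooth in the usual sense, which it certainly is.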


7.3 Classification of differentiable structures


Definition. Let $\phi\colon M\to N$ be a bijective map between smooth manifolds. If both $\phi$ and
$\phi^{-1}$ are smooth, then $\phi$ is said to be a diffeomorphism.

Diffeomorphisms are the structure-preserving maps between smooth manifolds.


Definition. Two manifolds $(M,\mathcal{O}_M,\mathcal{A}_M)$, $(N,\mathcal{O}_N,\mathcal{A}_N)$ are said to be diffeomorphic if
there exists a diffeomorphism $\phi\colon M\to N$ between them. We write $M\cong_{\mathrm{diff}}N$.


Note that if the differentiable structure is understood (or irrelevant), we typically write
$M$ instead of the triple $(M,\mathcal{O}_M,\mathcal{A}_M)$.


Remark 7.7. Being diffeomorphic is an equivalence relation. In fact, it is customary to consider
diffeomorphic manifolds to be the same from the point of view of differential geometry.
This is similar to the situation with topological spaces, where we consider homeomorphic
spaces to be the same from the point of view of topology. This is typical of all
structure-preserving maps.


Armed with the notion of diffeomorphism, we can now ask the following question: how 
many smooth structures on a given topological space are there, up to diffeomorphism? 
The answer is quite surprising: it depends on the dimension of the manifold! 


Theorem 7.8 (Radó-Moise). Let $M$ be a manifold with $\dim M = 1$, $2$, or $3$. Then there
is a unique smooth structure on $M$ up to diffeomorphism.


Recall that in a previous example, we showed that we can equip $(\mathbb{R},\mathcal{O}_{\mathrm{std}})$ with two
incompatible atlases $\mathcal{A}$ and $\mathcal{B}$. Let $\mathcal{A}_{\max}$ and $\mathcal{B}_{\max}$ be their extensions to maximal atlases,
and consider the smooth manifolds $(\mathbb{R},\mathcal{O}_{\mathrm{std}},\mathcal{A}_{\max})$ and $(\mathbb{R},\mathcal{O}_{\mathrm{std}},\mathcal{B}_{\max})$. Clearly, these are
different manifolds, because the atlases are different, but since $\dim\mathbb{R}=1$, they must be
diffeomorphic.
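
In this particular case a diffeomorphism can be written down by hand. The following sketch uses the charts $(\mathbb{R},\mathrm{id}_{\mathbb{R}})\in\mathcal{A}_{\max}$ and $(\mathbb{R},x)\in\mathcal{B}_{\max}$ with $x(a)=\sqrt[3]{a}$ from Example 7.3; the candidate map $\phi(a) := a^3$ is our own choice and is not given in the text.

```latex
\begin{align*}
\phi &\colon (\mathbb{R}, \mathcal{O}_{\mathrm{std}}, \mathcal{A}_{\max})
      \to (\mathbb{R}, \mathcal{O}_{\mathrm{std}}, \mathcal{B}_{\max}),
  & \phi(a) &:= a^3, \\
(x \circ \phi \circ \mathrm{id}_{\mathbb{R}}^{-1})(a) &= \sqrt[3]{a^3} = a,
  & (\mathrm{id}_{\mathbb{R}} \circ \phi^{-1} \circ x^{-1})(b) &= \sqrt[3]{b^3} = b .
\end{align*}
```

Both chart representations are the identity on $\mathbb{R}$, hence smooth, and by Proposition 7.4 it suffices to check a single pair of charts. So $\phi$ is a diffeomorphism even though $\mathcal{A}$ and $\mathcal{B}$ are incompatible as atlases.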

The answer to the case $\dim M > 4$ (we emphasize $\dim M\neq 4$) is provided by surgery
theory. This is a collection of tools and techniques in topology with which one obtains a 
new manifold from given ones by performing surgery on them, i.e. by cutting, replacing and 
gluing parts in such a way as to control topological invariants like the fundamental group. 
The idea is to understand all manifolds in dimensions higher than 4 by performing surgery 
systematically. In particular, using surgery theory, it has been shown that there are only 
finitely many smooth manifolds (up to diffeomorphism) one can make from a topological 
manifold. 

This is not as neat as the previous case, but since there are only finitely many structures, 
we can still enumerate them, i.e. we can write an exhaustive list. 

While finding all the differentiable structures may be difficult for any given manifold, 
this theorem has an immediate impact on a physical theory that models spacetime as a 
manifold. For instance, some physicists believe that spacetime should be modelled as a 
10-dimensional manifold (we are neither proposing nor condemning this view). If that is 
indeed the case, we need to worry about which differentiable structure we equip our 10- 
dimensional manifold with, as each different choice will likely lead to different predictions. 
But since there are only finitely many such structures, physicists can, at least in principle,
devise and perform finitely many experiments to distinguish between them and determine 
which is the right one, if any. 

We now turn to the special case $\dim M = 4$. The result is that if $M$ is a non-compact
topological manifold, then there are uncountably many non-diffeomorphic smooth structures
that we can equip $M$ with. In particular, this applies to $(\mathbb{R}^4,\mathcal{O}_{\mathrm{std}})$.

In the compact case there are only partial results. By way of example, here is one such 
result. 


Proposition 7.9. If $(M,\mathcal{O})$, with $\dim M = 4$, has $b_2 > 18$, where $b_2$ is the second Betti
number, then there are countably many non-diffeomorphic smooth structures that we can
equip $(M,\mathcal{O})$ with.


Betti numbers are defined in terms of homology groups, but intuitively we have:

• $b_0$ is the number of connected components a space has;

• $b_1$ is the number of circular (1-dimensional) holes a space has;

• $b_2$ is the number of 2-dimensional holes a space has;

and so on. Hence if a manifold has a number of 2-dimensional holes greater than 18, then
there are only countably many structures that we can choose from, but they are still infinitely
many.


8 Tensor space theory I: over a field 


8.1 Vector spaces 


We begin with a quick review of vector spaces. 


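Definition. An (algebraic) field is a triple $(K,+,\cdot)$, where $K$ is a set and $+,\cdot$ are maps
$K\times K\to K$ satisfying the following axioms:

• $(K,+)$ is an abelian group, i.e.

i) $\forall a,b,c\in K: (a+b)+c = a+(b+c)$;
ii) $\exists 0\in K: \forall a\in K: a+0 = 0+a = a$;
iii) $\forall a\in K: \exists -a\in K: a+(-a) = (-a)+a = 0$;
iv) $\forall a,b\in K: a+b = b+a$;

• $(K^*,\cdot)$, where $K^* := K\setminus\{0\}$, is an abelian group, i.e.

v) $\forall a,b,c\in K^*: (a\cdot b)\cdot c = a\cdot(b\cdot c)$;
vi) $\exists 1\in K^*: \forall a\in K^*: a\cdot 1 = 1\cdot a = a$;
vii) $\forall a\in K^*: \exists a^{-1}\in K^*: a\cdot a^{-1} = a^{-1}\cdot a = 1$;
viii) $\forall a,b\in K^*: a\cdot b = b\cdot a$;

• the maps $+$ and $\cdot$ satisfy the distributive property:

ix) $\forall a,b,c\in K: (a+b)\cdot c = a\cdot c+b\cdot c$.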

Remark 8.1. In the above definition, we included axiom iv for the sake of clarity, but in
fact it can be proven starting from the other axioms.


Remark 8.2. A weaker notion that we will encounter later is that of a ring. This is also
defined as a triple $(R,+,\cdot)$, but we do not require axioms vi, vii and viii to hold. If a
ring satisfies axiom vi, it is called a unital ring, and if it satisfies axiom viii, it is called a
commutative ring. We will mostly consider unital rings, and call them just rings.


Example 8.3. The triple $(\mathbb{Z},+,\cdot)$ is a commutative, unital ring. However, it is not a field
since $1$ and $-1$ are the only two elements which admit an inverse under multiplication.


Example 8.4. The sets $\mathbb{Q}$, $\mathbb{R}$, $\mathbb{C}$ are all fields under the usual $+$ and $\cdot$ operations.


Example 8.5. An example of a non-commutative ring is the set of real $n\times n$ matrices
$M_{n\times n}(\mathbb{R})$, with $n\geq 2$, under the usual operations.
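
A concrete pair of matrices that fail to commute makes the last example tangible. The following is a minimal sketch in plain Python; the helper `matmul` and the two $2\times 2$ matrices are hypothetical choices made for illustration.

```python
def matmul(a, b):
    """Multiply two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


# Hypothetical 2 x 2 matrices chosen for illustration.
A = [[0, 1],
     [0, 0]]
B = [[0, 0],
     [1, 0]]

print(matmul(A, B))  # [[1, 0], [0, 0]]
print(matmul(B, A))  # [[0, 0], [0, 1]]
```

Since the two products differ, axiom viii fails, while associativity and distributivity still hold for matrix multiplication.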


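Definition. Let $(K,+,\cdot)$ be a field. A $K$-vector space, or vector space over $K$, is a triple
$(V,\oplus,\odot)$, where $V$ is a set and

$$\oplus\colon V\times V\to V, \qquad \odot\colon K\times V\to V$$

are maps satisfying the following axioms:

• $(V,\oplus)$ is an abelian group;

• the map $\odot$ is an action of $K$ on $(V,\oplus)$:

i) $\forall\lambda\in K: \forall v,w\in V: \lambda\odot(v\oplus w) = (\lambda\odot v)\oplus(\lambda\odot w)$;
ii) $\forall\lambda,\mu\in K: \forall v\in V: (\lambda+\mu)\odot v = (\lambda\odot v)\oplus(\mu\odot v)$;
iii) $\forall\lambda,\mu\in K: \forall v\in V: (\lambda\cdot\mu)\odot v = \lambda\odot(\mu\odot v)$;
iv) $\forall v\in V: 1\odot v = v$.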


Vector spaces are also called linear spaces. Their elements are called vectors, while 
the elements of K are often called scalars, and the map © is called scalar multiplication. 
You should already be familiar with the various vector space constructions from your linear 
algebra course. For example, recall: 


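Definition. Let $(V,\oplus,\odot)$ be a vector space over $K$ and let $U\subseteq V$ be non-empty. Then
we say that $(U,\oplus|_{U\times U},\odot|_{K\times U})$ is a vector subspace of $(V,\oplus,\odot)$ if:

i) $\forall u_1,u_2\in U: u_1\oplus u_2\in U$;
ii) $\forall u\in U: \forall\lambda\in K: \lambda\odot u\in U$.

More succinctly, if $\forall u_1,u_2\in U: \forall\lambda\in K: (\lambda\odot u_1)\oplus u_2\in U$.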


Also recall that if we have n vector spaces over K, we can form the n-fold Cartesian
product of their underlying sets and make it into a vector space over K by defining the
operations $\oplus$ and $\odot$ componentwise.

As usual by now, we will look at the structure-preserving maps between vector spaces. 


Definition. Let $(V,\oplus,\odot)$, $(W,\boxplus,\boxdot)$ be vector spaces over the same field K and let
$f\colon V \to W$ be a map. We say that f is a linear map if for all $v_1,v_2 \in V$ and all $\lambda \in K$

\[ f((\lambda \odot v_1) \oplus v_2) = (\lambda \boxdot f(v_1)) \boxplus f(v_2). \]

From now on, we will drop the special notation for the vector space operations and
suppress the dot for scalar multiplication. For instance, we will write the equation above
as $f(\lambda v_1 + v_2) = \lambda f(v_1) + f(v_2)$, hoping that this will not cause any confusion.


Definition. A bijective linear map is called a linear isomorphism of vector spaces. Two
vector spaces are said to be isomorphic if there exists a linear isomorphism between them.
We write $V \cong_{\mathrm{vec}} W$.


Remark 8.6. Note that, unlike what happens with topological spaces, the inverse of a 
bijective linear map is automatically linear, hence we do not need to specify this in the 
definition of linear isomorphism. 


Definition. Let V and W be vector spaces over the same field K. Define the set

\[ \mathrm{Hom}(V,W) := \{ f \mid f\colon V \xrightarrow{\sim} W \}, \]

where the notation $f\colon V \xrightarrow{\sim} W$ stands for “f is a linear map from V to W”.
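The hom-set Hom(V,W) can itself be made into a vector space over K by defining:

\[ \oplus\colon \mathrm{Hom}(V,W) \times \mathrm{Hom}(V,W) \to \mathrm{Hom}(V,W), \qquad (f,g) \mapsto f \oplus g, \]

where

\[ f \oplus g\colon V \xrightarrow{\sim} W, \qquad v \mapsto (f \oplus g)(v) := f(v) + g(v), \]

and

\[ \odot\colon K \times \mathrm{Hom}(V,W) \to \mathrm{Hom}(V,W), \qquad (\lambda,f) \mapsto \lambda \odot f, \]

where

\[ \lambda \odot f\colon V \xrightarrow{\sim} W, \qquad v \mapsto (\lambda \odot f)(v) := \lambda f(v). \]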


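It is easy to check that both $f \oplus g$ and $\lambda \odot f$ are indeed linear maps from V to W. For
instance, we have:

\begin{align*}
(\lambda \odot f)(\mu v_1 + v_2) &= \lambda f(\mu v_1 + v_2) && \text{(by definition)} \\
&= \lambda(\mu f(v_1) + f(v_2)) && \text{(since $f$ is linear)} \\
&= \lambda\mu f(v_1) + \lambda f(v_2) && \text{(by axioms i and iii)} \\
&= \mu\lambda f(v_1) + \lambda f(v_2) && \text{(since $K$ is a field)} \\
&= \mu(\lambda \odot f)(v_1) + (\lambda \odot f)(v_2)
\end{align*}

so that $\lambda \odot f \in \mathrm{Hom}(V,W)$. One should also check that $\oplus$ and $\odot$ satisfy the vector space
axioms.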


Remark 8.7. Notice that in the definition of vector space, none of the axioms require that
K necessarily be a field. In fact, just a (unital) ring would suffice. Vector spaces over rings
have a name of their own. They are called modules over a ring, and we will meet them
later.

For the moment, it is worth pointing out that everything we have done so far applies
equally well to modules over a ring, up to and including the definition of Hom(V,W).
However, if we try to make Hom(V,W) into a module, we run into trouble. Notice that
in the derivation above, we used the fact that the multiplication in a field is commutative. But
this is not the case in general in a ring.


The following terminology is commonly used.


Definition. Let V be a vector space. An endomorphism of V is a linear map $V \to V$. We
write $\mathrm{End}(V) := \mathrm{Hom}(V,V)$.

Definition. Let V be a vector space. An automorphism of V is a linear isomorphism
$V \to V$. We write $\mathrm{Aut}(V) := \{ f \in \mathrm{End}(V) \mid f \text{ is an isomorphism} \}$.


Remark 8.8. Note that, unlike End(V), Aut(V) is not a vector space, contrary to what was claimed in
lecture. It is, however, a group under the operation of composition of linear maps.


Definition. Let V be a vector space over K. The dual vector space to V is

\[ V^* := \mathrm{Hom}(V,K), \]

where K is considered as a vector space over itself.


The dual vector space to V is the vector space of linear maps from V to the underlying 
field K, which are variously called linear functionals, covectors, or one-forms on V. The 
dual plays a very important role, in that from a vector space and its dual, we will construct 
the tensor products. 

8.2 Tensors and tensor spaces 
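Definition. Let V, W, Z be vector spaces over K. A map $f\colon V \times W \to Z$ is said to be
bilinear if

• $\forall\, w \in W : \forall\, v_1,v_2 \in V : \forall\, \lambda \in K : f(\lambda v_1 + v_2, w) = \lambda f(v_1,w) + f(v_2,w)$;
• $\forall\, v \in V : \forall\, w_1,w_2 \in W : \forall\, \lambda \in K : f(v, \lambda w_1 + w_2) = \lambda f(v,w_1) + f(v,w_2)$;

i.e. if the maps $v \mapsto f(v,w)$, for any fixed w, and $w \mapsto f(v,w)$, for any fixed v, are both
linear as maps $V \to Z$ and $W \to Z$, respectively.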


Remark 8.9. Compare this with the definition of a linear map $f\colon V \times W \to Z$:

\[ \forall\, x,y \in V \times W : \forall\, \lambda \in K : f(\lambda x + y) = \lambda f(x) + f(y). \]

More explicitly, if $x = (v_1,w_1)$ and $y = (v_2,w_2)$, then:

\[ f(\lambda(v_1,w_1) + (v_2,w_2)) = \lambda f((v_1,w_1)) + f((v_2,w_2)). \]

A bilinear map out of V x W is not the same as a linear map out of V x W. In fact,
bilinearity is just a special kind of non-linearity.

Example 8.10. The map $f\colon \mathbb{R}^2 \to \mathbb{R}$ given by $(x,y) \mapsto x + y$ is linear but not bilinear, while
the map $(x,y) \mapsto xy$ is bilinear but not linear.
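
The distinction between linear and bilinear maps on $\mathbb{R}^2$ can be probed numerically. The following is only a sketch: the helper names `is_linear` and `is_bilinear` are ours, and a randomised check with integer inputs can refute a property but can only give evidence for it, not a proof.

```python
import random

def add_map(p):
    """(x, y) -> x + y"""
    return p[0] + p[1]

def mul_map(p):
    """(x, y) -> x * y"""
    return p[0] * p[1]

def is_linear(f, trials=200, seed=0):
    """Test f(lam * p + q) == lam * f(p) + f(q) on random integer points of R^2."""
    rng = random.Random(seed)
    for _ in range(trials):
        lam = rng.randint(-9, 9)
        p = (rng.randint(-9, 9), rng.randint(-9, 9))
        q = (rng.randint(-9, 9), rng.randint(-9, 9))
        combo = (lam * p[0] + q[0], lam * p[1] + q[1])
        if f(combo) != lam * f(p) + f(q):
            return False
    return True

def is_bilinear(f, trials=200, seed=0):
    """Test linearity in each slot separately, with the other slot held fixed."""
    rng = random.Random(seed)
    for _ in range(trials):
        lam, x1, x2, y1, y2 = (rng.randint(-9, 9) for _ in range(5))
        if f((lam * x1 + x2, y1)) != lam * f((x1, y1)) + f((x2, y1)):
            return False
        if f((x1, lam * y1 + y2)) != lam * f((x1, y1)) + f((x1, y2)):
            return False
    return True

print(is_linear(add_map), is_bilinear(add_map))
print(is_linear(mul_map), is_bilinear(mul_map))
```

Integer inputs are used so that all comparisons are exact rather than subject to floating-point rounding.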


We can immediately generalise the above to define multilinear maps out of a Cartesian 


product of vector spaces. 


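Definition. Let V be a vector space over K. A (p,q)-tensor T on V is a multilinear map

\[ T\colon \underbrace{V^* \times \cdots \times V^*}_{p \text{ copies}} \times \underbrace{V \times \cdots \times V}_{q \text{ copies}} \to K. \]

We write

\[ T^p_q V := \underbrace{V \otimes \cdots \otimes V}_{p \text{ copies}} \otimes \underbrace{V^* \otimes \cdots \otimes V^*}_{q \text{ copies}} := \{ T \mid T \text{ is a } (p,q)\text{-tensor on } V \}. \]
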
Definition. A type (p,0) tensor is called a contravariant p-tensor, while a tensor of type (0,q)
is called a covariant q-tensor.

Remark 8.11. By convention, a (0,0)-tensor on V is just an element of K, and hence $T^0_0 V = K$.


Remark 8.12. Note that to define $T^p_q V$ as a set, we should be careful and invoke the principle
of restricted comprehension, i.e. we should say where the T's are coming from. In general,
say we want to build a set of maps $f\colon A \to B$ satisfying some property p. Recall that the
notation $f\colon A \to B$ is hiding the fact that f is a relation (indeed, a functional relation), and
a relation between A and B is a subset of $A \times B$. Therefore, we ought to write:

\[ \{ f \in \mathcal{P}(A \times B) \mid f\colon A \to B \text{ and } p(f) \}. \]

In the case of $T^p_q V$ we have:

\[ T^p_q V := \{ T \in \mathcal{P}(\underbrace{V^* \times \cdots \times V^*}_{p \text{ copies}} \times \underbrace{V \times \cdots \times V}_{q \text{ copies}} \times K) \mid T \text{ is a } (p,q)\text{-tensor on } V \}, \]

although we will not write this down every time.


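The set $T^p_q V$ can be equipped with a K-vector space structure by defining

\[ \oplus\colon T^p_q V \times T^p_q V \to T^p_q V, \qquad (T,S) \mapsto T \oplus S \]

and

\[ \odot\colon K \times T^p_q V \to T^p_q V, \qquad (\lambda,T) \mapsto \lambda \odot T, \]

where $T \oplus S$ and $\lambda \odot T$ are defined pointwise, as we did with Hom(V,W).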
We now define an important way of obtaining a new tensor from two given ones. 


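Definition. Let $T \in T^p_q V$ and $S \in T^r_s V$. The tensor product of T and S is the tensor
$T \otimes S \in T^{p+r}_{q+s} V$ defined by:

\[ (T \otimes S)(\omega_1,\dots,\omega_p,\omega_{p+1},\dots,\omega_{p+r},v_1,\dots,v_q,v_{q+1},\dots,v_{q+s}) := T(\omega_1,\dots,\omega_p,v_1,\dots,v_q)\, S(\omega_{p+1},\dots,\omega_{p+r},v_{q+1},\dots,v_{q+s}), \]

with $\omega_i \in V^*$ and $v_i \in V$.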
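
The way the arguments are split between T and S can be made concrete in code. The sketch below assumes $V = \mathbb{R}^2$ with vectors and covectors both stored as 2-tuples of components (an illustrative choice, not part of the definition); the function name `tensor_product` and the sample tensors are hypothetical.

```python
def tensor_product(T, p, q, S, r, s):
    """Return T (x) S for a (p,q)-tensor T and an (r,s)-tensor S.

    The result takes p + r covectors followed by q + s vectors,
    matching the argument order used in the definition above.
    """
    def TS(*args):
        if len(args) != p + r + q + s:
            raise ValueError("wrong number of arguments")
        covectors, vectors = args[:p + r], args[p + r:]
        # T gets the first p covectors and the first q vectors,
        # S gets the remaining r covectors and s vectors.
        return T(*covectors[:p], *vectors[:q]) * S(*covectors[p:], *vectors[q:])
    return TS

def T(w, v):
    """A (1,1)-tensor: the natural pairing of a covector with a vector."""
    return w[0] * v[0] + w[1] * v[1]

def S(v):
    """A (0,1)-tensor, i.e. a covector with components (3, -1)."""
    return 3 * v[0] - v[1]

TS = tensor_product(T, 1, 1, S, 0, 1)    # a (1,2)-tensor
value = TS((1, 2), (3, 4), (5, 6))       # T((1,2),(3,4)) * S((5,6)) = 11 * 9
print(value)
```

Since the product of the two values is taken in the field K, multilinearity of $T \otimes S$ follows directly from the multilinearity of T and S in their own slots.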


Some examples are in order. 


Example 8.13. a) $T^0_1 V := \{ T \mid T\colon V \xrightarrow{\sim} K \} = \mathrm{Hom}(V,K) =: V^*$. Note that here
multilinear is the same as linear since the maps only have one argument.

b) $T^1_1 V = V \otimes V^* := \{ T \mid T \text{ is a bilinear map } V^* \times V \to K \}$. We claim that this is
the same as $\mathrm{End}(V^*)$. Indeed, given $T \in V \otimes V^*$, we can construct $\hat{T} \in \mathrm{End}(V^*)$ as
follows:

\[ \hat{T}\colon V^* \xrightarrow{\sim} V^*, \qquad \omega \mapsto T(\omega,-), \]

where, for any fixed $\omega$, we have

\[ T(\omega,-)\colon V \xrightarrow{\sim} K, \qquad v \mapsto T(\omega,v). \]

The linearity of both $\hat{T}$ and $T(\omega,-)$ follows immediately from the bilinearity of T.
Hence $T(\omega,-) \in V^*$ for all $\omega$, and $\hat{T} \in \mathrm{End}(V^*)$. This correspondence is invertible,
since we can reconstruct T from $\hat{T}$ by defining

\[ T\colon V^* \times V \to K, \qquad (\omega,v) \mapsto T(\omega,v) := (\hat{T}(\omega))(v). \]

The correspondence is in fact linear, hence an isomorphism, and thus

\[ T^1_1 V \cong_{\mathrm{vec}} \mathrm{End}(V^*). \]


Other examples we would like to consider are

c) $T^1_0 V \stackrel{?}{\cong}_{\mathrm{vec}} V$: while you will find this stated as true in some physics textbooks, it is
in fact not true in general;

d) $T^1_1 V \stackrel{?}{\cong}_{\mathrm{vec}} \mathrm{End}(V)$: this is also not true in general;

e) $(V^*)^* \stackrel{?}{\cong}_{\mathrm{vec}} V$: this only holds if V is finite-dimensional.


The definition of dimension hinges on the notion of a basis. Given a vector space
without any additional structure, the only notion of basis that we can define is a so-called
Hamel basis.


Definition. Let $(V,+,\cdot)$ be a vector space over K. A subset $B \subseteq V$ is called a Hamel basis
for V if

• every finite subset $\{b_1,\dots,b_N\}$ of B is linearly independent, i.e.

\[ \sum_{i=1}^N \lambda^i b_i = 0 \;\Rightarrow\; \lambda^1 = \cdots = \lambda^N = 0; \]

• B is a generating or spanning set of V, i.e.

\[ \forall\, v \in V : \exists\, v^1,\dots,v^M \in K : \exists\, b_1,\dots,b_M \in B : v = \sum_{i=1}^M v^i b_i. \]




Remark 8.14. We can write the second condition more succinctly by defining

\[ \mathrm{span}_K(B) := \Big\{ \sum_{i=1}^n \lambda^i b_i \;\Big|\; \lambda^i \in K \wedge b_i \in B \wedge n \geq 1 \Big\} \]

and thus writing $V = \mathrm{span}_K(B)$.


Remark 8.15. Note that we have been using superscripts for the elements of K, and these 
should not be confused with exponents. 


The following characterisation of a Hamel basis is often useful. 


Proposition 8.16. Let V be a vector space and B a Hamel basis of V. Then B is a minimal
spanning and maximal independent subset of V, i.e., if $S \subseteq V$, then

• $\mathrm{span}(S) = V \Rightarrow |S| \geq |B|$;
• S is linearly independent $\Rightarrow |S| \leq |B|$.

Definition. Let V be a vector space. The dimension of V is $\dim V := |B|$, where B is a
Hamel basis for V.


Even though we will not prove it, it is the case that every Hamel basis for a given 
vector space has the same cardinality, and hence the notion of dimension is well-defined. 


Proposition 8.17. If $\dim V < \infty$ and $S \subseteq V$, then we have the following:

• if $\mathrm{span}_K(S) = V$ and $|S| = \dim V$, then S is a Hamel basis of V;

• if S is linearly independent and $|S| = \dim V$, then S is a Hamel basis of V.

Theorem 8.18. If $\dim V < \infty$, then $(V^*)^* \cong_{\mathrm{vec}} V$.


Sketch of proof. One constructs an explicit isomorphism as follows. Define the evaluation
map as

\[ \mathrm{ev}\colon V \xrightarrow{\sim} (V^*)^*, \qquad v \mapsto \mathrm{ev}_v, \]

where

\[ \mathrm{ev}_v\colon V^* \xrightarrow{\sim} K, \qquad \omega \mapsto \mathrm{ev}_v(\omega) := \omega(v). \]

The linearity of ev follows immediately from the linearity of the elements of $V^*$, while that
of $\mathrm{ev}_v$ from the fact that $V^*$ is a vector space. One then shows that ev is both injective
and surjective, and hence an isomorphism.


Remark 8.19. Note that while we need the concept of basis to state this result (since we
require $\dim V < \infty$), the isomorphism that we have constructed is independent of any
choice of basis.

Remark 8.20. It is not hard to show that $(V^*)^* \cong_{\mathrm{vec}} V$ implies $T^1_0 V \cong_{\mathrm{vec}} V$ and $T^1_1 V \cong_{\mathrm{vec}} \mathrm{End}(V)$.
So the last two hold in finite dimensions, but they need not hold in infinite
dimensions.


Remark 8.21. While a choice of basis often simplifies things, when defining new objects 
it is important to do so without making reference to a basis. If we do define something 
in terms of a basis (e.g. the dimension of a vector space), then we have to check that the 
thing is well-defined, i.e. it does not depend on which basis we choose. Some people say: 
“A gentleman only chooses a basis if he must.” 


If V is finite-dimensional, then $V^*$ is also finite-dimensional and $V \cong_{\mathrm{vec}} V^*$. Moreover,
given a basis B of V, there is a special basis of $V^*$ associated to B.

Definition. Let V be a finite-dimensional vector space with basis $B = \{e_1,\dots,e_{\dim V}\}$.
The dual basis to B is the unique basis $B' = \{f^1,\dots,f^{\dim V}\}$ of $V^*$ such that

\[ \forall\, 1 \leq i,j \leq \dim V : \quad f^i(e_j) = \delta^i_j := \begin{cases} 1 & \text{if } i = j \\ 0 & \text{if } i \neq j. \end{cases} \]
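
For a concrete feel of the dual basis, here is a small sketch for $V = \mathbb{R}^2$. It assumes that vectors and covectors are both stored as pairs of components with respect to the standard basis and its dual; the helper names `dual_basis_2d` and `pair` are ours. The dual covectors are read off as the rows of the inverse of the matrix whose columns are $e_1, e_2$.

```python
from fractions import Fraction

def pair(f, v):
    """Apply the covector f (components in the standard dual basis) to the vector v."""
    return f[0] * v[0] + f[1] * v[1]

def dual_basis_2d(e1, e2):
    """Return (f1, f2) with pair(f_i, e_j) = 1 if i == j else 0."""
    det = Fraction(e1[0] * e2[1] - e2[0] * e1[1])
    if det == 0:
        raise ValueError("e1 and e2 are linearly dependent, so they are not a basis")
    # Rows of the inverse of the matrix with columns e1, e2.
    f1 = (e2[1] / det, -e2[0] / det)
    f2 = (-e1[1] / det, e1[0] / det)
    return f1, f2

e1, e2 = (2, 1), (1, 1)
f1, f2 = dual_basis_2d(e1, e2)
print(f1, f2)
print(pair(f1, e1), pair(f1, e2), pair(f2, e1), pair(f2, e2))
```

Changing only $e_2$ changes $f^1$ as well: each dual basis element depends on the whole basis B, not just on the basis vector with the same label.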
Remark 8.22. If V is finite-dimensional, then V is isomorphic to both $V^*$ and $(V^*)^*$. In
the case of $V^*$, an isomorphism is given by sending each element of a basis B of V to a
different element of the dual basis B', and then extending linearly to V.

You will (and probably already have) read that a vector space is canonically isomorphic 
to its double dual, but not canonically isomorphic to its dual, because an arbitrary choice 
of basis on V is necessary in order to provide an isomorphism. 

The proper treatment of this matter falls within the scope of category theory, and the 
relevant notion is called natural isomorphism. See, for instance, the book Basic Category 
Theory by Tom Leinster for an introduction to the subject. 


Once we have a basis B, the expansion of $v \in V$ in terms of elements of B is, in fact,
unique. Hence we can meaningfully speak of the components of v in the basis B. The notion
of coordinates can also be generalised to the case of tensors.

Definition. Let V be a finite-dimensional vector space over K with basis $B = \{e_1,\dots,e_{\dim V}\}$
and let $T \in T^p_q V$. We define the components of T in the basis B to be the numbers

\[ T^{a_1 \dots a_p}{}_{b_1 \dots b_q} := T(f^{a_1},\dots,f^{a_p},e_{b_1},\dots,e_{b_q}) \in K, \]

where $1 \leq a_i, b_j \leq \dim V$ and $\{f^1,\dots,f^{\dim V}\}$ is the dual basis to B.


Just as with vectors, the components completely determine the tensor. Indeed, we can
reconstruct the tensor from its components by using the basis:

\[ T = \underbrace{\sum_{a_1=1}^{\dim V} \cdots \sum_{b_q=1}^{\dim V}}_{p+q \text{ sums}} T^{a_1 \dots a_p}{}_{b_1 \dots b_q}\, e_{a_1} \otimes \cdots \otimes e_{a_p} \otimes f^{b_1} \otimes \cdots \otimes f^{b_q}, \]

where the $e_{a_i}$s are understood as elements of $T^1_0 V \cong_{\mathrm{vec}} V$ and the $f^{b_i}$s as elements of
$T^0_1 V \cong_{\mathrm{vec}} V^*$. Note that each summand is a (p,q)-tensor and the (implicit) multiplication
between the components and the tensor product is the scalar multiplication in $T^p_q V$.
Change of basis

Let V be a vector space over K with $d = \dim V < \infty$ and let $\{e_1,\dots,e_d\}$ be a basis of V.
Consider a new basis $\{\tilde{e}_1,\dots,\tilde{e}_d\}$. Since the elements of the new basis are also elements of
V, we can expand them in terms of the old basis. We have:

\[ \tilde{e}_i = \sum_{j=1}^d A^j{}_i\, e_j \qquad \text{for } 1 \leq i \leq d, \]

for some $A^j{}_i \in K$. Similarly, we have

\[ e_i = \sum_{j=1}^d B^j{}_i\, \tilde{e}_j \qquad \text{for } 1 \leq i \leq d, \]

for some $B^j{}_i \in K$. It is a standard linear algebra result that the matrices A and B, with
entries $A^j{}_i$ and $B^j{}_i$ respectively, are invertible and, in fact, $A^{-1} = B$.


8.3. Notational conventions 
Einstein’s summation convention 


From now on, we will employ the Einstein summation convention, which consists in
suppressing the summation sign when the indices to be summed over each appear once as a
subscript and once as a superscript in the same term. For example, we write

\[ v = v^i e_i \qquad \text{and} \qquad T = T^{ij}{}_k\, e_i \otimes e_j \otimes f^k \]

instead of

\[ v = \sum_{i=1}^d v^i e_i \qquad \text{and} \qquad T = \sum_{i=1}^d \sum_{j=1}^d \sum_{k=1}^d T^{ij}{}_k\, e_i \otimes e_j \otimes f^k. \]
= wie 


Indices that are summed over are called dummy indices; they always appear in pairs and
clearly it does not matter which particular letter we choose to denote them, provided it
does not already appear in the expression. Indices that are not summed over are called
free indices; expressions containing free indices represent multiple expressions, one for each
value of the free indices; free indices must match on both sides of an equation. For example

\[ v^i e_i = v^k e_k, \qquad A_{ij} = C_k C^k B_{ij} + C_i C^k B_{kj} \]

are valid expressions, while

\[ c_k v^i e_i = c_k v^k e_k, \qquad A_{ij} = C_k C^k + C_i C^l B_{lj} \]

are not. The ranges over which the indices run are usually understood and not written out.
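
To see what the convention abbreviates, one can write the suppressed sums out as explicit loops. The sketch below evaluates the valid expression $A_{ij} = C_k C^k B_{ij} + C_i C^k B_{kj}$ for d = 2; the numerical values and the storage layout (`B[i][j]` for $B_{ij}$) are arbitrary choices made for illustration.

```python
d = 2
C_lower = [1, 2]            # components C_k
C_upper = [3, 4]            # components C^k
B = [[1, 0], [2, 1]]        # components B_ij, stored as B[i][j]

# C_k C^k: k is a dummy index, so it is summed over and no free index remains.
contraction = sum(C_lower[k] * C_upper[k] for k in range(d))

# A_ij = C_k C^k B_ij + C_i C^k B_kj: i and j are free, k is summed in each term.
A = [[contraction * B[i][j]
      + C_lower[i] * sum(C_upper[k] * B[k][j] for k in range(d))
      for j in range(d)]
     for i in range(d)]

# Renaming a dummy index does not change the value: C_k C^k = C_m C^m.
renamed = sum(C_lower[m] * C_upper[m] for m in range(d))

print(contraction, A)
```

Each free index becomes a loop that builds the output array, while each dummy index becomes a `sum` that disappears from the result.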
The convention on which indices go upstairs and which downstairs (which we have
already been using) is that:

• the basis vectors of V carry downstairs indices;
• the basis vectors of $V^*$ carry upstairs indices;
• all other placements are enforced by the Einstein summation convention.

For example, since the components of a vector must multiply the basis vectors and be
summed over, the Einstein summation convention requires that they carry upstairs indices.


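Example 8.23. Using the summation convention, we have:

a) $f^a(v) = f^a(v^b e_b) = v^b f^a(e_b) = v^b \delta^a_b = v^a$;

b) $\omega(e_b) = (\omega_a f^a)(e_b) = \omega_a f^a(e_b) = \omega_b$;

c) $\omega(v) = \omega_a f^a(v^b e_b) = \omega_a v^a$;

where $v \in V$, $\omega \in V^*$, $\{e_i\}$ is a basis of V and $\{f^j\}$ is the dual basis to $\{e_i\}$.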
Remark 8.24. The Einstein summation convention should only be used when dealing with
linear spaces and multilinear maps. The reason for this is the following. Consider a map
$\varphi\colon V \times W \to Z$, and let $v = v^i e_i \in V$ and $w = w^i \tilde{e}_i \in W$. Then, writing the sums explicitly, we have:

\[ \varphi(v,w) = \varphi\Big(\sum_{i=1}^d v^i e_i, \sum_{j=1}^{\tilde{d}} w^j \tilde{e}_j\Big) = \sum_{i=1}^d \sum_{j=1}^{\tilde{d}} \varphi(v^i e_i, w^j \tilde{e}_j) = \sum_{i=1}^d \sum_{j=1}^{\tilde{d}} v^i w^j \varphi(e_i, \tilde{e}_j). \]

Note that if we suppress the summation signs, the second and third terms above
are indistinguishable. But they are only equal if $\varphi$ is bilinear! Hence the summation convention
should not be used (at least, not without extra care) in other areas of mathematics.


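Remark 8.25. Having chosen a basis for V and the dual basis for $V^*$, it is very tempting to
think of $v = v^i e_i \in V$ and $\omega = \omega_i f^i \in V^*$ as d-tuples of numbers. In order to distinguish
them, one may choose to write vectors as columns of numbers and covectors as rows of
numbers:

\[ v = v^i e_i \;\rightsquigarrow\; v \,\hat{=} \begin{pmatrix} v^1 \\ \vdots \\ v^d \end{pmatrix} \]

and

\[ \omega = \omega_i f^i \;\rightsquigarrow\; \omega \,\hat{=}\, (\omega_1,\dots,\omega_d). \]

Given $\varphi \in \mathrm{End}(V) \cong_{\mathrm{vec}} T^1_1 V$, recall that we can write $\varphi = \varphi^i{}_j\, e_i \otimes f^j$, where $\varphi^i{}_j := \varphi(f^i,e_j)$
are the components of $\varphi$ with respect to the chosen basis. It is then also very tempting to
think of $\varphi$ as a square array of numbers:

\[ \varphi = \varphi^i{}_j\, e_i \otimes f^j \;\rightsquigarrow\; \varphi \,\hat{=} \begin{pmatrix} \varphi^1{}_1 & \varphi^1{}_2 & \cdots & \varphi^1{}_d \\ \varphi^2{}_1 & \varphi^2{}_2 & \cdots & \varphi^2{}_d \\ \vdots & \vdots & \ddots & \vdots \\ \varphi^d{}_1 & \varphi^d{}_2 & \cdots & \varphi^d{}_d \end{pmatrix}. \]

The convention here is to think of the i index on $\varphi^i{}_j$ as a row index, and of j as a column
index.

We cannot stress enough that this is pure convention. Its usefulness stems from the
following example.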


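Example 8.26. If $\dim V < \infty$, then we have $\mathrm{End}(V) \cong_{\mathrm{vec}} T^1_1 V$. Explicitly, if $\varphi \in \mathrm{End}(V)$,
we can think of $\varphi \in T^1_1 V$, using the same symbol, as

\[ \varphi(\omega,v) := \omega(\varphi(v)). \]

Hence the components of $\varphi \in \mathrm{End}(V)$ are $\varphi^a{}_b := f^a(\varphi(e_b))$.
Now consider $\varphi,\psi \in \mathrm{End}(V)$. Let us determine the components of $\varphi \circ \psi$. We have:

\begin{align*}
(\varphi \circ \psi)^a{}_b &:= (\varphi \circ \psi)(f^a,e_b) \\
&= f^a((\varphi \circ \psi)(e_b)) \\
&= f^a(\varphi(\psi(e_b))) \\
&= f^a(\varphi(\psi^m{}_b\, e_m)) \\
&= \psi^m{}_b\, f^a(\varphi(e_m)) \\
&= \psi^m{}_b\, \varphi^a{}_m.
\end{align*}

The multiplication in the last line is the multiplication in the field K, and since that is
commutative, we have $\psi^m{}_b\, \varphi^a{}_m = \varphi^a{}_m\, \psi^m{}_b$. However, in light of the convention introduced
in the previous remark, the latter is preferable. Indeed, if we think of the superscripts as
row indices and of the subscripts as column indices, then $\varphi^a{}_m\, \psi^m{}_b$ is the entry in row a,
column b, of the matrix product $\varphi\psi$.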
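
The claim that the components of a composition are given by the matrix product can be checked on a small case. The sketch below works in $V = \mathbb{R}^2$ with the standard basis and stores $\varphi^a{}_b$ as `phi[a][b]` (row index upstairs, column index downstairs); the sample matrices and helper names are ours.

```python
def apply(P, v):
    """Apply the endomorphism with components P[a][b] to the vector with components v[b]."""
    return [sum(P[a][b] * v[b] for b in range(len(v))) for a in range(len(P))]

def matmul(P, Q):
    """Matrix product: (PQ)[a][b] = P[a][m] * Q[m][b], summed over m."""
    return [[sum(P[a][m] * Q[m][b] for m in range(len(Q))) for b in range(len(Q[0]))]
            for a in range(len(P))]

def components_of_composition(phi, psi):
    """Components f^a((phi o psi)(e_b)) computed directly from the definition."""
    d = len(phi)
    basis = [[1 if i == b else 0 for i in range(d)] for b in range(d)]
    images = [apply(phi, apply(psi, e_b)) for e_b in basis]   # images[b][a]
    return [[images[b][a] for b in range(d)] for a in range(d)]

phi = [[1, 2], [3, 4]]
psi = [[0, 1], [5, 6]]

direct = components_of_composition(phi, psi)
print(direct, matmul(phi, psi), matmul(psi, phi))
```

The direct computation agrees with `matmul(phi, psi)` and not with `matmul(psi, phi)`, which is exactly why the ordering $\varphi^a{}_m\, \psi^m{}_b$ is the convenient one to write down.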


Similarly, $\omega(v) = \omega_m v^m$ can be thought of as the dot product $\omega \cdot v = \omega^T v$, and

\[ \varphi(\omega,v) = \omega_a\, \varphi^a{}_b\, v^b \;\rightsquigarrow\; \omega^T \varphi\, v. \]

The last expression could mislead you into thinking that the transpose is a “good” notion,
but in fact it is not. It is very bad notation. It almost pretends to be basis independent,
but it is not at all.

The moral of the story is that you should try your best not to think of vectors, covectors 
and tensors as arrays of numbers. Instead, always try to understand them from the abstract, 


intrinsic, component-free point of view. 


Change of components under a change of basis 


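Recall that if $\{e_a\}$ and $\{\tilde{e}_a\}$ are bases of V, we have

\[ \tilde{e}_a = A^b{}_a\, e_b \qquad \text{and} \qquad e_a = B^m{}_a\, \tilde{e}_m, \]

with $A^{-1} = B$. Note that in index notation, the equation $AB = I$ reads $A^a{}_m B^m{}_b = \delta^a_b$.
We now investigate how the components of vectors and covectors change under a change
of basis.

a) Let $v = v^a e_a = \tilde{v}^a \tilde{e}_a \in V$. Then:

\[ v^a = f^a(v) = f^a(\tilde{v}^b \tilde{e}_b) = \tilde{v}^b f^a(\tilde{e}_b) = \tilde{v}^b f^a(A^m{}_b\, e_m) = A^m{}_b\, \tilde{v}^b f^a(e_m) = A^a{}_b\, \tilde{v}^b. \]

b) Let $\omega = \omega_a f^a = \tilde{\omega}_a \tilde{f}^a \in V^*$. Then:

\[ \omega_a = \omega(e_a) = \omega(B^m{}_a\, \tilde{e}_m) = B^m{}_a\, \omega(\tilde{e}_m) = B^m{}_a\, \tilde{\omega}_m. \]

Summarising, for $v \in V$, $\omega \in V^*$ and $\tilde{e}_a = A^b{}_a\, e_b$, we have:

\[ v^a = A^a{}_b\, \tilde{v}^b, \qquad \omega_a = B^b{}_a\, \tilde{\omega}_b, \]

\[ \tilde{v}^a = B^a{}_b\, v^b, \qquad \tilde{\omega}_a = A^b{}_a\, \omega_b. \]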
The result for tensors is a combination of the above, depending on the type of tensor.

c) Let $T \in T^p_q V$. Then:

\[ T^{a_1 \dots a_p}{}_{b_1 \dots b_q} = A^{a_1}{}_{m_1} \cdots A^{a_p}{}_{m_p}\, B^{n_1}{}_{b_1} \cdots B^{n_q}{}_{b_q}\, \tilde{T}^{m_1 \dots m_p}{}_{n_1 \dots n_q}, \]

i.e. the upstairs indices transform like vector indices, and the downstairs indices
transform like covector indices.
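
A quick numerical sanity check of the transformation laws $\tilde{v}^a = B^a{}_b\, v^b$ and $\tilde{\omega}_a = A^b{}_a\, \omega_b$: the number $\omega(v) = \omega_a v^a$ must come out the same in both bases. The sketch below uses d = 2, an arbitrarily chosen integer matrix A with integer inverse B, and stores $A^a{}_b$ as `A[a][b]`; all of these are illustrative assumptions.

```python
d = 2
A = [[2, 1], [1, 1]]        # A[a][b] = A^a_b, the change of basis matrix
B = [[1, -1], [-1, 2]]      # B = A^{-1}

v = [3, 4]                  # components v^a in the old basis
w = [5, 6]                  # components omega_a in the old dual basis

# v~^a = B^a_b v^b   (vector components transform with B)
v_new = [sum(B[a][b] * v[b] for b in range(d)) for a in range(d)]

# omega~_a = A^b_a omega_b   (covector components transform with A)
w_new = [sum(A[b][a] * w[b] for b in range(d)) for a in range(d)]

pairing_old = sum(w[a] * v[a] for a in range(d))
pairing_new = sum(w_new[a] * v_new[a] for a in range(d))

# Going back: v^a = A^a_b v~^b
v_back = [sum(A[a][b] * v_new[b] for b in range(d)) for a in range(d)]

print(v_new, w_new, pairing_old, pairing_new)
```

The components change, but the contraction $\omega_a v^a$ does not, because one factor picks up A and the other picks up $B = A^{-1}$.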


8.4 Determinants 


In your previous course on linear algebra, you may have met the determinant of a square 
matrix as a number calculated by applying a mysterious rule. Using the mysterious rule, 
you may have shown, with a lot of work, that for example, if we exchange two rows or two 
columns, the determinant changes sign. But, as we have seen, matrices are the result of 
pure convention. Hence, one more polemic remark is in order. 


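Remark 8.27. Recall that, if $\varphi \in T^1_1 V$, then we can arrange the components $\varphi^a{}_b$ in matrix
form:

\[ \varphi = \varphi^a{}_b\, e_a \otimes f^b \;\rightsquigarrow\; \varphi \,\hat{=} \begin{pmatrix} \varphi^1{}_1 & \varphi^1{}_2 & \cdots & \varphi^1{}_d \\ \varphi^2{}_1 & \varphi^2{}_2 & \cdots & \varphi^2{}_d \\ \vdots & \vdots & \ddots & \vdots \\ \varphi^d{}_1 & \varphi^d{}_2 & \cdots & \varphi^d{}_d \end{pmatrix}. \]

Similarly, if we have $g \in T^0_2 V$, its components are $g_{ab} := g(e_a,e_b)$ and we can write

\[ g = g_{ab}\, f^a \otimes f^b \;\rightsquigarrow\; g \,\hat{=} \begin{pmatrix} g_{11} & g_{12} & \cdots & g_{1d} \\ g_{21} & g_{22} & \cdots & g_{2d} \\ \vdots & \vdots & \ddots & \vdots \\ g_{d1} & g_{d2} & \cdots & g_{dd} \end{pmatrix}. \]

Needless to say, these two objects could not be more different if they tried. Indeed

• $\varphi$ is an endomorphism of V; the first index in $\varphi^a{}_b$ transforms like a vector index, while
the second index transforms like a covector index;

• g is a bilinear form on V; both indices in $g_{ab}$ transform like covector indices.

In linear algebra, you may have seen the two different transformation laws for these objects:

\[ \varphi \to A^{-1} \varphi A \qquad \text{and} \qquad g \to A^T g A, \]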

where A is the change of basis matrix. However, once we fix a basis, the matrix representa- 
tions of these two objects are indistinguishable. It is then very tempting to think that what 
we can do with a matrix, we can just as easily do with another matrix. For instance, if we 
have a rule to calculate the determinant of a square matrix, we should be able to apply it 
to both of the above matrices. 

However, the notion of determinant is only defined for endomorphisms. The only way 
to see this is to give a basis-independent definition, i.e. a definition that does not involve 


the “components of a matrix”. 


We will need some preliminary definitions. 
Definition. Let M be a set. A permutation of M is a bijection $M \to M$.

Definition. The symmetric group of order n, denoted $S_n$, is the set of permutations of
$\{1,\dots,n\}$ under the operation of functional composition.

Definition. A transposition is a permutation which exchanges two elements, keeping all
other elements fixed.

Proposition 8.28. Every permutation $\pi \in S_n$ can be written as a product (composition)
of transpositions in $S_n$.

While this decomposition is not unique, for each given $\pi \in S_n$ the parity of the number of
transpositions is the same in every decomposition: it is always even or always odd. Hence, we can define the sign (or
signature) of $\pi \in S_n$ as:

\[ \mathrm{sgn}(\pi) := \begin{cases} +1 & \text{if } \pi \text{ is the product of an even number of transpositions} \\ -1 & \text{if } \pi \text{ is the product of an odd number of transpositions.} \end{cases} \]
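
The sign of a permutation is easy to compute in practice. The sketch below uses the count of inversions, whose parity agrees with the parity of any decomposition into transpositions; this is a standard fact, swapped in here for convenience rather than taken from the text. Permutations are stored as tuples acting on the labels 0, ..., n-1 instead of 1, ..., n, which is only a relabelling.

```python
from itertools import permutations

def sgn(perm):
    """Sign of a permutation of (0, ..., n-1), given as a tuple of images."""
    n = len(perm)
    inversions = sum(1 for i in range(n) for j in range(i + 1, n) if perm[i] > perm[j])
    return -1 if inversions % 2 else 1

def compose(p, q):
    """Composition p o q, i.e. first apply q, then p."""
    return tuple(p[q[i]] for i in range(len(q)))

S3 = list(permutations(range(3)))
multiplicative = all(sgn(compose(p, q)) == sgn(p) * sgn(q) for p in S3 for q in S3)

print([(p, sgn(p)) for p in S3])
print(multiplicative)
```

The final check illustrates that sgn is compatible with composition, which is what makes the definition via products of transpositions consistent.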


Definition. Let V be a d-dimensional vector space. An n-form on V is a (0,n)-tensor $\omega$
that is totally antisymmetric, i.e.

\[ \forall\, \pi \in S_n : \omega(v_1,v_2,\dots,v_n) = \mathrm{sgn}(\pi)\, \omega(v_{\pi(1)},v_{\pi(2)},\dots,v_{\pi(n)}). \]

Note that a 0-form is a scalar, and a 1-form is a covector. A d-form is also called a
top form. There are several equivalent definitions of n-form. For instance, we have the
following.

Proposition 8.29. A (0,n)-tensor $\omega$ is an n-form if, and only if, $\omega(v_1,\dots,v_n) = 0$ whenever
$\{v_1,\dots,v_n\}$ is linearly dependent.

While $T^0_n V$ certainly contains non-zero elements when n > d, the proposition immediately implies
that any n-form with n > d must be identically zero. This is because a collection of more
than d vectors from a d-dimensional vector space is necessarily linearly dependent.




Proposition 8.30. Denote by $\Lambda^n V$ the vector space of n-forms on V. Then we have

\[ \dim \Lambda^n V = \begin{cases} \binom{d}{n} & \text{if } 1 \leq n \leq d \\ 0 & \text{if } n > d, \end{cases} \]

where $\binom{d}{n} = \frac{d!}{n!(d-n)!}$ is the binomial coefficient, read as “d choose n”.

In particular, $\dim \Lambda^d V = 1$. This means that

\[ \forall\, \omega,\omega' \in \Lambda^d V \text{ with } \omega' \neq 0 : \exists\, c \in K : \omega = c\,\omega', \]

i.e. there is essentially only one top form on V, up to a scalar factor.

Definition. A choice of non-zero top form on V is called a choice of volume form on V. A vector
space with a chosen volume form is then called a vector space with volume.


This terminology is due to the next definition. 


Definition. Let $\dim V = d$ and let $\omega \in \Lambda^d V$ be a volume form on V. Given $v_1,\dots,v_d \in V$,
the volume spanned by $v_1,\dots,v_d$ is

\[ \mathrm{vol}(v_1,\dots,v_d) := \omega(v_1,\dots,v_d). \]

Intuitively, the antisymmetry condition on $\omega$ makes sure that $\mathrm{vol}(v_1,\dots,v_d)$ is zero
whenever the set $\{v_1,\dots,v_d\}$ is not linearly independent. Indeed, in that case $v_1,\dots,v_d$
could only span a (d − 1)-dimensional hypersurface in V at most, which should have 0
volume.


Remark 8.31. You may have rightfully thought that the notion of volume would require 
some extra structure on V, such as a notion of length or angles, and hence an inner product. 
But instead, we only need a top form. 


We are finally ready to define the determinant. 


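Definition. Let V be a d-dimensional vector space and let $\varphi \in \mathrm{End}(V) \cong_{\mathrm{vec}} T^1_1 V$. The
determinant of $\varphi$ is

\[ \det \varphi := \frac{\omega(\varphi(e_1),\dots,\varphi(e_d))}{\omega(e_1,\dots,e_d)} \]

for some volume form $\omega \in \Lambda^d V$ and some basis $\{e_1,\dots,e_d\}$ of V.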


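The first thing we need to do is to check that this is well-defined. That $\det \varphi$ is
independent of the choice of $\omega$ is clear, since if $\omega,\omega' \in \Lambda^d V$ are volume forms, then there is a $c \in K$ such that
$\omega = c\,\omega'$, and hence

\[ \frac{\omega(\varphi(e_1),\dots,\varphi(e_d))}{\omega(e_1,\dots,e_d)} = \frac{c\,\omega'(\varphi(e_1),\dots,\varphi(e_d))}{c\,\omega'(e_1,\dots,e_d)} = \frac{\omega'(\varphi(e_1),\dots,\varphi(e_d))}{\omega'(e_1,\dots,e_d)}. \]

The independence from the choice of basis is more cumbersome to show, but it does hold,
and thus $\det \varphi$ is well-defined.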
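
The basis-independent definition can be tried out on $V = \mathbb{R}^2$. In the sketch below the top form `omega` is built from the signed sum over permutations (one concrete choice of top form, normalised so that it returns 1 on the standard basis); the function names, the sample endomorphism and the sample bases are all illustrative assumptions.

```python
from fractions import Fraction
from itertools import permutations

def sgn(perm):
    """Sign of a permutation of (0, ..., n-1) via its inversion count."""
    n = len(perm)
    inversions = sum(1 for i in range(n) for j in range(i + 1, n) if perm[i] > perm[j])
    return -1 if inversions % 2 else 1

def omega(*vectors):
    """A totally antisymmetric d-linear map on R^d (a top form)."""
    d = len(vectors)
    total = 0
    for perm in permutations(range(d)):
        term = sgn(perm)
        for i in range(d):
            term *= vectors[i][perm[i]]
        total += term
    return total

def apply(phi, v):
    """Apply the endomorphism with components phi[a][b] to the vector v."""
    return [sum(phi[a][b] * v[b] for b in range(len(v))) for a in range(len(phi))]

def det(phi, basis, form=omega):
    """det(phi) = form(phi(e_1), ..., phi(e_d)) / form(e_1, ..., e_d)."""
    images = [apply(phi, e) for e in basis]
    return Fraction(form(*images), form(*basis))

phi = [[1, 2], [3, 4]]
standard = [(1, 0), (0, 1)]
other = [(2, 1), (1, 3)]

def rescaled(*vectors):
    """Another top form, a scalar multiple of omega."""
    return 7 * omega(*vectors)

print(det(phi, standard), det(phi, other), det(phi, other, rescaled))
```

All three calls return the same number, illustrating (not proving) that neither the choice of basis nor the choice of volume form matters.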




Note that $\varphi$ needs to be an endomorphism because we need to apply $\omega$ to $\varphi(e_1),\dots,\varphi(e_d)$,
and thus $\varphi$ needs to output a vector.

Of course, under the identification of ¢@ as a matrix, this definition coincides with the 
usual definition of determinant, and all your favourite results about determinants can be 


derived from it. 


Remark 8.32. In your linear algebra course, you may have shown that the determinant is
basis-independent as follows: if A denotes the change of basis matrix, then

\[ \det(A^{-1} \varphi A) = \det(A^{-1}) \det(\varphi) \det(A) = \det(A^{-1} A) \det(\varphi) = \det(\varphi) \]

since scalars commute, and $\det(A^{-1} A) = \det(I) = 1$.
Recall that the transformation rule for a bilinear form g under a change of basis is
$g \to A^T g A$. The determinant of g then transforms as

\[ \det(A^T g A) = \det(A^T) \det(g) \det(A) = (\det A)^2 \det(g), \]

i.e. it is not invariant under a change of basis. It is not a well-defined object, and thus we
should not use it.
We will later meet quantities X that transform as

\[ X \to \frac{1}{(\det A)^2}\, X \]

under a change of basis, and hence they are also not well-defined. However, we obviously
have

\[ \det(g) X \to (\det A)^2 \det(g)\, \frac{1}{(\det A)^2}\, X = \det(g) X \]

so that the product $\det(g) X$ is a well-defined object. It seems that two wrongs make a
right!

In order to make this mathematically precise, we will have to introduce principal fibre
bundles. Using them, we will be able to give a bundle definition of tensors and of tensor
densities which are, loosely speaking, quantities that transform with powers of det A under
a change of basis.
9 Differential structures: the pivotal concept of tangent vector spaces


9.1 Tangent spaces to a manifold 


In this section, whenever we say “manifold”, we mean a (real) d-dimensional differentiable 
manifold, unless we explicitly say otherwise. We will also suppress the differentiable struc- 


ture in the notation. 


Definition. Let M be a manifold. We define the infinite-dimensional vector space over $\mathbb{R}$
with underlying set

\[ C^\infty(M) := \{ f\colon M \to \mathbb{R} \mid f \text{ is smooth} \} \]

and operations defined pointwise, i.e. for any $p \in M$,

\[ (f + g)(p) := f(p) + g(p), \qquad (\lambda f)(p) := \lambda f(p). \]

A routine check shows that this is indeed a vector space. We can similarly define
$C^\infty(U)$, with U an open subset of M.


Definition. A smooth curve on M is a smooth map $\gamma\colon \mathbb{R} \to M$, where $\mathbb{R}$ is understood as
a 1-dimensional manifold.

This definition also applies to smooth maps $I \to M$ for an open interval $I \subseteq \mathbb{R}$.

Definition. Let $\gamma\colon \mathbb{R} \to M$ be a smooth curve through $p \in M$; w.l.o.g. let $\gamma(0) = p$. The
directional derivative operator at p along $\gamma$ is the linear map

\[ X_{\gamma,p}\colon C^\infty(M) \xrightarrow{\sim} \mathbb{R}, \qquad f \mapsto (f \circ \gamma)'(0), \]

where $\mathbb{R}$ is understood as a 1-dimensional vector space over the field $\mathbb{R}$.

Note that $f \circ \gamma$ is a map $\mathbb{R} \to \mathbb{R}$, hence we can calculate the usual derivative and
evaluate it at 0.


Remark 9.1. In differential geometry, $X_{\gamma,p}$ is called the tangent vector to the curve $\gamma$ at
the point $p \in M$. Intuitively, $X_{\gamma,p}$ is the velocity of $\gamma$ at p. Consider the curve $\delta(t) := \gamma(2t)$,
which is the same curve parametrised twice as fast. We have, for any $f \in C^\infty(M)$:

\[ X_{\delta,p}(f) = (f \circ \delta)'(0) = 2 (f \circ \gamma)'(0) = 2 X_{\gamma,p}(f) \]

by using the chain rule. Hence $X_{\gamma,p}$ scales like a velocity should.
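
The scaling behaviour of the tangent vector can be observed numerically. The sketch below takes $M = \mathbb{R}^2$, a curve on the unit circle through p = (1, 0), and approximates $(f \circ \gamma)'(0)$ by a central finite difference; the curve, the function f and the step size h are arbitrary illustrative choices.

```python
import math

def directional_derivative(f, curve, h=1e-6):
    """Approximate X_{curve,p}(f) = (f o curve)'(0) by a central difference."""
    return (f(curve(h)) - f(curve(-h))) / (2 * h)

def gamma(t):
    """A smooth curve in R^2 with gamma(0) = p = (1, 0)."""
    return (math.cos(t), math.sin(t))

def delta(t):
    """The same curve, parametrised twice as fast."""
    return gamma(2 * t)

def f(point):
    """A smooth function on R^2."""
    x, y = point
    return x * x + 3 * y

X_gamma = directional_derivative(f, gamma)
X_delta = directional_derivative(f, delta)
print(X_gamma, X_delta)
```

Here $(f \circ \gamma)(t) = \cos^2 t + 3 \sin t$, whose derivative at 0 is 3, so the two printed values should be close to 3 and 6 respectively.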


Definition. Let M be a manifold and $p \in M$. The tangent space to M at p is the vector
space over $\mathbb{R}$ with underlying set

\[ T_pM := \{ X_{\gamma,p} \mid \gamma \text{ is a smooth curve through } p \}, \]

addition

\[ \oplus\colon T_pM \times T_pM \to T_pM, \qquad (X_{\gamma,p},X_{\delta,p}) \mapsto X_{\gamma,p} \oplus X_{\delta,p}, \]

and scalar multiplication

\[ \odot\colon \mathbb{R} \times T_pM \to T_pM, \qquad (\lambda,X_{\gamma,p}) \mapsto \lambda \odot X_{\gamma,p}, \]

both defined pointwise, i.e. for any $f \in C^\infty(M)$,

\[ (X_{\gamma,p} \oplus X_{\delta,p})(f) := X_{\gamma,p}(f) + X_{\delta,p}(f), \qquad (\lambda \odot X_{\gamma,p})(f) := \lambda X_{\gamma,p}(f). \]

Note that the outputs of these operations do not look like elements in $T_pM$, because
they are not of the form $X_{\sigma,p}$ for some curve $\sigma$. Hence, we need to show that the above
operations are, in fact, well-defined.


Proposition 9.2. Let $X_{\gamma,p}, X_{\delta,p} \in T_pM$ and $\lambda \in \mathbb{R}$. Then, we have $X_{\gamma,p} \oplus X_{\delta,p} \in T_pM$
and $\lambda \odot X_{\gamma,p} \in T_pM$.

Since the derivative is a local concept, it is only the behaviour of curves near p that
matters. In particular, if two curves $\gamma$ and $\delta$ agree on a neighbourhood of p, then $X_{\gamma,p}$ and
$X_{\delta,p}$ are the same element of $T_pM$. Hence, we can work locally by using a chart on M.


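Proof. Let (U,x) be a chart on M, with U a neighbourhood of p.

i) Define the curve

\[ \sigma(t) := x^{-1}((x \circ \gamma)(t) + (x \circ \delta)(t) - x(p)). \]

Note that $\sigma$ is smooth since it is constructed via addition and composition of smooth
maps and, moreover:

\begin{align*}
\sigma(0) &= x^{-1}(x(\gamma(0)) + x(\delta(0)) - x(p)) \\
&= x^{-1}(x(p) + x(p) - x(p)) \\
&= x^{-1}(x(p)) \\
&= p.
\end{align*}

Thus $\sigma$ is a smooth curve through p. Let $f \in C^\infty(U)$ be arbitrary. Then we have

\begin{align*}
X_{\sigma,p}(f) &:= (f \circ \sigma)'(0) \\
&= [f \circ x^{-1} \circ ((x \circ \gamma) + (x \circ \delta) - x(p))]'(0)
\end{align*}

where $(f \circ x^{-1})\colon \mathbb{R}^d \to \mathbb{R}$ and $((x \circ \gamma) + (x \circ \delta) - x(p))\colon \mathbb{R} \to \mathbb{R}^d$, so by the multivariable
chain rule

\[ = [\partial_a (f \circ x^{-1})(x(p))]\, ((x^a \circ \gamma) + (x^a \circ \delta) - x^a(p))'(0) \]

where $x^a$, with $1 \leq a \leq d$, are the component functions of x, and since the derivative
is linear, we get

\begin{align*}
&= [\partial_a (f \circ x^{-1})(x(p))]\, ((x^a \circ \gamma)'(0) + (x^a \circ \delta)'(0)) \\
&= (f \circ x^{-1} \circ x \circ \gamma)'(0) + (f \circ x^{-1} \circ x \circ \delta)'(0) \\
&= (f \circ \gamma)'(0) + (f \circ \delta)'(0) \\
&=: (X_{\gamma,p} \oplus X_{\delta,p})(f).
\end{align*}

Therefore $X_{\gamma,p} \oplus X_{\delta,p} = X_{\sigma,p} \in T_pM$.

ii) The second part is straightforward. Define $\sigma(t) := \gamma(\lambda t)$. This is again a smooth
curve through p and, by the chain rule, we have:

\[ (\lambda \odot X_{\gamma,p})(f) := \lambda (f \circ \gamma)'(0) = (f \circ \sigma)'(0) =: X_{\sigma,p}(f) \]

for any $f \in C^\infty(U)$. Hence $\lambda \odot X_{\gamma,p} = X_{\sigma,p} \in T_pM$.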
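
The construction of the curve $\sigma(t) = x^{-1}((x \circ \gamma)(t) + (x \circ \delta)(t) - x(p))$ in part i) can be tested numerically. The sketch below takes $M = \mathbb{R}^2$ with a deliberately non-linear global chart $x(u,v) = (u, v - u^2)$ and two curves through p = (1, 1); the chart, the curves and the function f are hypothetical examples, and derivatives are approximated by central differences.

```python
import math

def chart(point):
    """A global chart x on M = R^2: x(u, v) = (u, v - u^2)."""
    u, v = point
    return (u, v - u * u)

def chart_inverse(coords):
    """The inverse chart: x^{-1}(a, b) = (a, b + a^2)."""
    a, b = coords
    return (a, b + a * a)

p = (1.0, 1.0)

def gamma(t):
    return (1.0 + t, 1.0 + t * t)

def delta(t):
    return (1.0 + math.sin(t), 1.0 + 2.0 * t)

def sigma(t):
    """sigma(t) = x^{-1}( x(gamma(t)) + x(delta(t)) - x(p) )."""
    xg, xd, xp = chart(gamma(t)), chart(delta(t)), chart(p)
    return chart_inverse(tuple(g + d - c for g, d, c in zip(xg, xd, xp)))

def f(point):
    u, v = point
    return u * v + v * v

def tangent(curve, func, h=1e-5):
    """Approximate X_{curve,p}(func) = (func o curve)'(0)."""
    return (func(curve(h)) - func(curve(-h))) / (2.0 * h)

lhs = tangent(sigma, f)
rhs = tangent(gamma, f) + tangent(delta, f)
print(sigma(0.0), lhs, rhs)
```

For these choices one finds $X_{\gamma,p}(f) = 1$ and $X_{\delta,p}(f) = 7$ by hand, so both printed derivatives should be close to 8, even though $\sigma$ is not the naive pointwise sum of the two curves.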


Remark 9.3. We now give a slightly different (but equivalent) definition of $T_pM$. Consider
the set of smooth curves

\[ S = \{ \gamma\colon I \to M \mid \text{with } I \subseteq \mathbb{R} \text{ open, } 0 \in I \text{ and } \gamma(0) = p \} \]

and define the equivalence relation $\sim$ on S

\[ \gamma \sim \delta \;:\Leftrightarrow\; (x \circ \gamma)'(0) = (x \circ \delta)'(0) \]

for some (and hence every) chart (U,x) containing p. Then, we can define

\[ T_pM := S/\!\sim. \]

9.2 Algebras and derivations 


Before we continue looking at properties of tangent spaces, we will have a short aside on 
algebras and derivations. 


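Definition. An algebra over a field K is a quadruple $(A,+,\cdot,\bullet)$, where $(A,+,\cdot)$ is a K-vector
space and $\bullet$ is a product on A, i.e. a (K-)bilinear map $\bullet\colon A \times A \to A$.

Example 9.4. Define a product on $C^\infty(M)$ by

\[ \bullet\colon C^\infty(M) \times C^\infty(M) \to C^\infty(M), \qquad (f,g) \mapsto f \bullet g, \]

where $f \bullet g$ is defined pointwise. Then $(C^\infty(M),+,\cdot,\bullet)$ is an algebra over $\mathbb{R}$.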
The usual qualifiers apply to algebras as well.

Definition. An algebra $(A,+,\cdot,\bullet)$ is said to be

i) associative if $\forall\, v,w,z \in A : v \bullet (w \bullet z) = (v \bullet w) \bullet z$;

ii) unital if $\exists\, 1 \in A : \forall\, v \in A : 1 \bullet v = v \bullet 1 = v$;

iii) commutative or abelian if $\forall\, v,w \in A : v \bullet w = w \bullet v$.

Example 9.5. Clearly, $(C^\infty(M),+,\cdot,\bullet)$ is an associative, unital, commutative algebra.


An important class of algebras are the so-called Lie algebras, in which the product $v \bullet w$
is usually denoted $[v,w]$.

Definition. A Lie algebra A is an algebra whose product $[-,-]$, called Lie bracket, satisfies

i) antisymmetry: $\forall\, v \in A : [v,v] = 0$;

ii) the Jacobi identity: $\forall\, v,w,z \in A : [v,[w,z]] + [w,[z,v]] + [z,[v,w]] = 0$.

Note that the zeros above represent the additive identity element in A, not the zero scalar.

The antisymmetry condition immediately implies $[v,w] = -[w,v]$ for all $v,w \in A$,
hence a (non-trivial) Lie algebra cannot be unital.
Example 9.6. Let $V$ be a vector space over $K$. Then $(\mathrm{End}(V),+,\cdot,\circ)$ is an associative, unital, non-commutative algebra over $K$. Define
\[ [-,-]\colon \mathrm{End}(V) \times \mathrm{End}(V) \to \mathrm{End}(V), \qquad (\phi,\psi) \mapsto [\phi,\psi] := \phi \circ \psi - \psi \circ \phi. \]
It is instructive to check that $(\mathrm{End}(V),+,\cdot,[-,-])$ is a Lie algebra over $K$. In this case, the Lie bracket is typically called the commutator.

In general, given an associative algebra $(A,+,\cdot,\bullet)$, if we define
\[ [v,w] := v \bullet w - w \bullet v, \]
then $(A,+,\cdot,[-,-])$ is a Lie algebra.
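As a quick sanity check of the claim that the commutator of an associative algebra is a Lie bracket, the following minimal Python sketch verifies antisymmetry and the Jacobi identity for $2 \times 2$ integer matrices. The matrices `A`, `B`, `C` and all helper names are illustrative choices, not part of the notes, and checking one triple is of course not a proof.

```python
def matmul(a, b):
    """Product of two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def add(a, b):
    """Entrywise sum of two matrices."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def sub(a, b):
    """Entrywise difference of two matrices."""
    return [[x - y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def commutator(a, b):
    """[a, b] := a b - b a."""
    return sub(matmul(a, b), matmul(b, a))

def jacobi_sum(v, w, z):
    """[v,[w,z]] + [w,[z,v]] + [z,[v,w]], which should vanish."""
    return add(add(commutator(v, commutator(w, z)),
                   commutator(w, commutator(z, v))),
               commutator(z, commutator(v, w)))

# Arbitrary sample matrices (integer entries keep the arithmetic exact).
A = [[1, 2], [3, 4]]
B = [[0, 1], [-1, 5]]
C = [[2, 0], [7, -3]]
ZERO = [[0, 0], [0, 0]]

print(commutator(A, A) == ZERO)    # antisymmetry in the form [v, v] = 0
print(jacobi_sum(A, B, C) == ZERO) # Jacobi identity for this triple
```

Integer entries are used so that the comparisons with the zero matrix are exact rather than subject to rounding.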


Definition. Let $A$ be an algebra. A derivation on $A$ is a linear map $D\colon A \to A$ satisfying the Leibniz rule
\[ D(v \bullet w) = D(v) \bullet w + v \bullet D(w) \]
for all $v,w \in A$.

Remark 9.7. The definition of derivation can be extended to include maps $A \to B$, with suitable structures. The obvious first attempt would be to consider two algebras $(A,+_A,\cdot_A,\bullet_A)$, $(B,+_B,\cdot_B,\bullet_B)$, and require $D\colon A \to B$ to satisfy
\[ D(v \bullet_A w) = D(v) \bullet_B w + v \bullet_B D(w). \]
However, this is meaningless as it stands since $\bullet_B\colon B \times B \to B$, but on the right hand side $\bullet_B$ acts on elements from $A$ too. In order for this to work, $B$ needs to be equipped with a product by elements of $A$, both from the left and from the right. The structure we are looking for is called a bimodule over $A$, and we will meet this later on.

Example 9.8. The usual derivative operator is a derivation on $C^\infty(\mathbb{R})$, the algebra of smooth real functions, since it is linear and satisfies the Leibniz rule.

The second derivative operator, however, is not a derivation on $C^\infty(\mathbb{R})$, since it does not satisfy the Leibniz rule: $(fg)'' = f''g + 2f'g' + fg''$, which differs from $f''g + fg''$ whenever $f'g' \neq 0$. This shows that the composition of derivations need not be a derivation.
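Example 9.9. Consider again the Lie algebra $(\mathrm{End}(V),+,\cdot,[-,-])$ and fix $\xi \in \mathrm{End}(V)$. If we define
\[ D_\xi := [\xi,-]\colon \mathrm{End}(V) \to \mathrm{End}(V), \qquad \phi \mapsto [\xi,\phi], \]
then $D_\xi$ is a derivation on $(\mathrm{End}(V),+,\cdot,[-,-])$ since it is linear and
\begin{align*}
D_\xi([\phi,\psi]) &:= [\xi,[\phi,\psi]] \\
&= -[\psi,[\xi,\phi]] - [\phi,[\psi,\xi]] && \text{(by the Jacobi identity)} \\
&= [[\xi,\phi],\psi] + [\phi,[\xi,\psi]] && \text{(by antisymmetry)} \\
&=: [D_\xi(\phi),\psi] + [\phi,D_\xi(\psi)].
\end{align*}
This construction works in general Lie algebras as well.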


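Example 9.10. We denote by $\mathrm{Der}_K(A)$ the set of derivations on a $K$-algebra $(A,+,\cdot,\bullet)$. This set can be endowed with a $K$-vector space structure by defining the operations pointwise but, by a previous example, it cannot be made into an algebra under composition of derivations.

However, derivations are maps, so we can still compose them as maps and define
\[ [-,-]\colon \mathrm{Der}_K(A) \times \mathrm{Der}_K(A) \to \mathrm{Der}_K(A), \qquad (D_1,D_2) \mapsto [D_1,D_2] := D_1 \circ D_2 - D_2 \circ D_1. \]
The map $[D_1,D_2]$ is (perhaps surprisingly) a derivation, since it is linear and
\begin{align*}
[D_1,D_2](v \bullet w) &:= (D_1 \circ D_2 - D_2 \circ D_1)(v \bullet w) \\
&= D_1(D_2(v \bullet w)) - D_2(D_1(v \bullet w)) \\
&= D_1(D_2(v) \bullet w + v \bullet D_2(w)) - D_2(D_1(v) \bullet w + v \bullet D_1(w)) \\
&= D_1(D_2(v) \bullet w) + D_1(v \bullet D_2(w)) - D_2(D_1(v) \bullet w) - D_2(v \bullet D_1(w)) \\
&= D_1(D_2(v)) \bullet w + D_2(v) \bullet D_1(w) + D_1(v) \bullet D_2(w) + v \bullet D_1(D_2(w)) \\
&\quad - D_2(D_1(v)) \bullet w - D_1(v) \bullet D_2(w) - D_2(v) \bullet D_1(w) - v \bullet D_2(D_1(w)) \\
&= \bigl(D_1(D_2(v)) - D_2(D_1(v))\bigr) \bullet w + v \bullet \bigl(D_1(D_2(w)) - D_2(D_1(w))\bigr) \\
&= [D_1,D_2](v) \bullet w + v \bullet [D_1,D_2](w),
\end{align*}
where the mixed terms $D_2(v) \bullet D_1(w)$ and $D_1(v) \bullet D_2(w)$ cancel in pairs.

Then $(\mathrm{Der}_K(A),+,\cdot,[-,-])$ is a Lie algebra over $K$.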
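The contrast between composition and commutator of derivations can be illustrated on polynomials. The sketch below is only an illustration under stated assumptions: it models the algebra of real polynomials by integer coefficient lists (index equals power), and picks the two derivations $D_1 = \mathrm{d}/\mathrm{d}x$ and $D_2 = x^2\,\mathrm{d}/\mathrm{d}x$; these choices and all function names are hypothetical and not taken from the notes.

```python
def trim(p):
    """Drop trailing zero coefficients so equal polynomials compare equal."""
    p = list(p)
    while p and p[-1] == 0:
        p.pop()
    return p

def add(p, q):
    """Sum of two polynomials given as coefficient lists."""
    n = max(len(p), len(q))
    return trim([(p[i] if i < len(p) else 0) + (q[i] if i < len(q) else 0)
                 for i in range(n)])

def sub(p, q):
    """Difference of two polynomials."""
    return add(p, [-c for c in q])

def mul(p, q):
    """Product of two polynomials (the algebra product)."""
    if not p or not q:
        return []
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return trim(out)

def d1(p):
    """D1 = d/dx."""
    return trim([k * p[k] for k in range(1, len(p))])

def d2(p):
    """D2 = x^2 d/dx, another derivation on polynomials."""
    return mul([0, 0, 1], d1(p))

def compose(f, g):
    """Composition f after g."""
    return lambda p: f(g(p))

def bracket(f, g):
    """Commutator [f, g] = f o g - g o f."""
    return lambda p: sub(f(g(p)), g(f(p)))

def leibniz_holds(d, v, w):
    """Check d(v w) = d(v) w + v d(w) for one pair (v, w)."""
    return d(mul(v, w)) == add(mul(d(v), w), mul(v, d(w)))

v = [1, 2, 3]   # 1 + 2x + 3x^2
w = [5, 0, 1]   # 5 + x^2

print(leibniz_holds(d1, v, w))               # D1 satisfies the Leibniz rule
print(leibniz_holds(compose(d1, d1), v, w))  # D1 o D1 fails it for this pair
print(leibniz_holds(bracket(d1, d2), v, w))  # [D1, D2] satisfies it again
```

For these particular choices one can also check by hand that $[D_1,D_2] = 2x\,\mathrm{d}/\mathrm{d}x$, which is visibly a derivation.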


If we have a manifold, we can define the related notion of derivation on an open subset at a point.

Definition. Let $M$ be a manifold and let $p \in U \subseteq M$, where $U$ is open. A derivation on $U$ at $p$ is an $\mathbb{R}$-linear map $D\colon C^\infty(U) \to \mathbb{R}$ satisfying the Leibniz rule
\[ D(fg) = D(f)\,g(p) + f(p)\,D(g). \]
We denote by $\mathrm{Der}_p(U)$ the $\mathbb{R}$-vector space of derivations on $U$ at $p$, with operations defined pointwise.

Example 9.11. The tangent vector $X_{\gamma,p}$ is a derivation on $U \subseteq M$ at $p$, where $U$ is any neighbourhood of $p$. In fact, our definition of the tangent space is equivalent to
\[ T_pM := \mathrm{Der}_p(U), \]
for some open $U$ containing $p$. One can show that this does not depend on which neighbourhood $U$ of $p$ we pick.

9.3 A basis for the tangent space


The following is a crucially important result about tangent spaces. 


Theorem 9.12. Let $M$ be a manifold and let $p \in M$. Then
\[ \dim T_pM = \dim M. \]

Remark 9.13. Note carefully that, despite us using the same symbol, the two “dimensions” appearing in the statement of the theorem are, at least on the surface, entirely unrelated. Indeed, recall that $\dim M$ is defined in terms of charts $(U,x)$, with $x\colon U \to x(U) \subseteq \mathbb{R}^{\dim M}$, while $\dim T_pM = |\mathcal{B}|$, where $\mathcal{B}$ is a Hamel basis for the vector space $T_pM$. The idea behind the proof is to construct a basis of $T_pM$ from a chart on $M$.


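Proof. W.l.o.g., let $(U,x)$ be a chart centred at $p$, i.e. $x(p) = 0 \in \mathbb{R}^{\dim M}$. Define $(\dim M)$-many curves $\gamma_{(a)}\colon \mathbb{R} \to U$ through $p$ by requiring $(x^b \circ \gamma_{(a)})(t) = \delta^b_a\, t$, i.e.
\[ \gamma_{(a)}(0) := p, \qquad \gamma_{(a)}(t) := x^{-1}\bigl((0,\dots,0,t,0,\dots,0)\bigr), \]
where the $t$ is in the $a$-th position, with $1 \le a \le \dim M$. Let us calculate the action of the tangent vector $X_{\gamma_{(a)},p} \in T_pM$ on an arbitrary function $f \in C^\infty(U)$:
\begin{align*}
X_{\gamma_{(a)},p}(f) &:= (f \circ \gamma_{(a)})'(0) \\
&= (f \circ \mathrm{id}_U \circ \gamma_{(a)})'(0) \\
&= (f \circ x^{-1} \circ x \circ \gamma_{(a)})'(0) \\
&= [\partial_b(f \circ x^{-1})(x(p))]\,(x^b \circ \gamma_{(a)})'(0) \\
&= [\partial_b(f \circ x^{-1})(x(p))]\,\delta^b_a \\
&= \partial_a(f \circ x^{-1})(x(p)).
\end{align*}
We introduce a special notation for this tangent vector:
\[ \left(\frac{\partial}{\partial x^a}\right)_p := X_{\gamma_{(a)},p}, \]
where the $x$ refers to the chart map. We now claim that
\[ \mathcal{B} = \left\{ \left(\frac{\partial}{\partial x^a}\right)_p \in T_pM \;\middle|\; 1 \le a \le \dim M \right\} \]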

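is a basis of $T_pM$. First, we show that $\mathcal{B}$ spans $T_pM$.

Let $X \in T_pM$. Then, by definition, there exists some smooth curve $\sigma$ through $p$ such that $X = X_{\sigma,p}$. For any $f \in C^\infty(U)$, we have
\begin{align*}
X(f) &= X_{\sigma,p}(f) \\
&:= (f \circ \sigma)'(0) \\
&= (f \circ x^{-1} \circ x \circ \sigma)'(0) \\
&= [\partial_b(f \circ x^{-1})(x(p))]\,(x^b \circ \sigma)'(0) \\
&= (x^b \circ \sigma)'(0) \left(\frac{\partial}{\partial x^b}\right)_p (f).
\end{align*}
Since $(x^b \circ \sigma)'(0) =: X^b \in \mathbb{R}$, we have:
\[ X = X^b \left(\frac{\partial}{\partial x^b}\right)_p, \]
i.e. any $X \in T_pM$ is a linear combination of elements from $\mathcal{B}$.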


To show linear independence, suppose that
\[ \lambda^a \left(\frac{\partial}{\partial x^a}\right)_p = 0 \]
for some scalars $\lambda^a$. Note that this is an operator equation, and the zero on the right hand side is the zero operator $0 \in T_pM$.

Recall that, given the chart $(U,x)$, the coordinate maps $x^b\colon U \to \mathbb{R}$ are smooth, i.e. $x^b \in C^\infty(U)$. Thus, we can feed them into the left hand side to obtain
\begin{align*}
0 &= \lambda^a \left(\frac{\partial}{\partial x^a}\right)_p (x^b) \\
&= \lambda^a\, \partial_a(x^b \circ x^{-1})(x(p)) \\
&= \lambda^a\, \partial_a(\mathrm{proj}_b)(x(p)) \\
&= \lambda^a\, \delta^b_a \\
&= \lambda^b,
\end{align*}
i.e. $\lambda^b = 0$ for all $1 \le b \le \dim M$. So $\mathcal{B}$ is indeed a basis of $T_pM$, and since by construction $|\mathcal{B}| = \dim M$, the proof is complete.

Remark 9.14. While it is possible to define infinite-dimensional manifolds, in this course we will only consider finite-dimensional ones. Hence $\dim T_pM = \dim M$ will always be finite in this course.


Remark 9.15. Note that the basis that we have constructed in the proof is not chart- 
independent. Indeed, each different chart will induce a different tangent space basis, and 
we distinguish between them by keeping the chart map in the notation for the basis elements. 

This is not a cause of concern for our proof however, since every basis of a vector space 
must have the same cardinality, and hence it suffices to find one basis to determine the 
dimension. 
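The chart-induced basis vectors constructed in the proof can be probed numerically. The following sketch is a rough illustration under several assumptions of our own: it takes $M$ to be the right half-plane of $\mathbb{R}^2$ with the polar chart $x = (r,\theta)$, uses a chart that is not centred at $p$ (so the coordinate curves are shifted to pass through $p$ at $t = 0$), picks the sample function $f(p_x,p_y) = p_x^2\,p_y$, and replaces exact derivatives by central differences.

```python
import math

def chart_inverse(r, theta):
    """x^{-1}: polar coordinates -> point of the right half-plane."""
    return (r * math.cos(theta), r * math.sin(theta))

def f(point):
    """A sample smooth function on M (an arbitrary illustrative choice)."""
    px, py = point
    return px * px * py

def coordinate_curve(a, coords):
    """Curve t -> x^{-1}(x(p) + t e_a) through the point with chart image coords."""
    def gamma(t):
        shifted = list(coords)
        shifted[a] += t
        return chart_inverse(*shifted)
    return gamma

def velocity_on(func, gamma, h=1e-5):
    """Approximate (func o gamma)'(0) by a central difference."""
    return (func(gamma(h)) - func(gamma(-h))) / (2 * h)

r0, theta0 = 2.0, 0.5

# Action of the two chart-induced basis vectors on f.
d_r = velocity_on(f, coordinate_curve(0, (r0, theta0)))
d_theta = velocity_on(f, coordinate_curve(1, (r0, theta0)))

# Partial derivatives of f o x^{-1} = r^3 cos^2(theta) sin(theta), computed by hand.
exact_r = 3 * r0**2 * math.cos(theta0)**2 * math.sin(theta0)
exact_theta = r0**3 * (math.cos(theta0)**3
                       - 2 * math.cos(theta0) * math.sin(theta0)**2)

print(abs(d_r - exact_r) < 1e-6, abs(d_theta - exact_theta) < 1e-6)
```

Replacing the polar chart by the identity chart would change both numbers, which is the chart-dependence of the basis noted in Remark 9.15.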


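Remark 9.16. While the symbol $\left(\frac{\partial}{\partial x^a}\right)_p$ has nothing to do with the idea of partial differentiation with respect to the variable $x^a$, it is notationally consistent with it, in the following sense.

Let $M = \mathbb{R}^d$, $(U,x) = (\mathbb{R}^d, \mathrm{id}_{\mathbb{R}^d})$ and let $\left(\frac{\partial}{\partial x^a}\right)_p \in T_p\mathbb{R}^d$. If $f \in C^\infty(\mathbb{R}^d)$, then
\[ \left(\frac{\partial}{\partial x^a}\right)_p (f) = \partial_a(f \circ x^{-1})(x(p)) = \partial_a f(p), \]
since $x = x^{-1} = \mathrm{id}_{\mathbb{R}^d}$. Moreover, we have $\mathrm{proj}_a = x^a$. Thus, we can think of $x^1,\dots,x^d$ as the independent variables of $f$, and we can then write
\[ \left(\frac{\partial}{\partial x^a}\right)_p (f) = \frac{\partial f}{\partial x^a}(p). \]

Definition. Let $X \in T_pM$ be a tangent vector and let $(U,x)$ be a chart containing $p$. If
\[ X = X^a \left(\frac{\partial}{\partial x^a}\right)_p, \]
then the real numbers $X^1,\dots,X^{\dim M}$ are called the components of $X$ with respect to the tangent space basis induced by the chart $(U,x)$. The basis $\left\{\left(\frac{\partial}{\partial x^a}\right)_p\right\}$ is also called a co-ordinate basis.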


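Proposition 9.17. Let $X \in T_pM$ and let $(U,x)$ and $(V,y)$ be two charts containing $p$. Then we have
\[ \left(\frac{\partial}{\partial x^a}\right)_p = \partial_a(y^b \circ x^{-1})(x(p)) \left(\frac{\partial}{\partial y^b}\right)_p. \]

Proof. Assume w.l.o.g. that $U = V$. Since $\left(\frac{\partial}{\partial x^a}\right)_p \in T_pM$ and $\left\{\left(\frac{\partial}{\partial y^b}\right)_p\right\}$ forms a basis, we must have
\[ \left(\frac{\partial}{\partial x^a}\right)_p = \lambda^b \left(\frac{\partial}{\partial y^b}\right)_p \]
for some $\lambda^b$. Let us determine what the $\lambda^b$ are by applying both sides of the equation to the coordinate maps $y^c$:
\begin{align*}
\left(\frac{\partial}{\partial x^a}\right)_p (y^c) &= \partial_a(y^c \circ x^{-1})(x(p)); \\
\lambda^b \left(\frac{\partial}{\partial y^b}\right)_p (y^c) &= \lambda^b\, \partial_b(y^c \circ y^{-1})(y(p)) \\
&= \lambda^b\, \partial_b(\mathrm{proj}_c)(y(p)) \\
&= \lambda^b\, \delta^c_b \\
&= \lambda^c.
\end{align*}
Hence
\[ \lambda^c = \partial_a(y^c \circ x^{-1})(x(p)). \]
Substituting this expression for $\lambda^c$ gives the result.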


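Corollary 9.18. Let $X \in T_pM$ and let $(U,x)$ and $(V,y)$ be two charts containing $p$. Denote by $X^a$ and $\widetilde{X}^b$ the components of $X$ with respect to the tangent space bases induced by the two charts, respectively. Then we have:
\[ \widetilde{X}^b = \partial_a(y^b \circ x^{-1})(x(p))\, X^a. \]

Proof. Applying the previous result,
\[ X = X^a \left(\frac{\partial}{\partial x^a}\right)_p = X^a\, \partial_a(y^b \circ x^{-1})(x(p)) \left(\frac{\partial}{\partial y^b}\right)_p. \]
Hence, we read off $\widetilde{X}^b = \partial_a(y^b \circ x^{-1})(x(p))\, X^a$.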
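The transformation law of Corollary 9.18 can be checked on a concrete pair of charts. The sketch below is an illustration under our own assumptions: $M$ is the right half-plane of $\mathbb{R}^2$, the old chart $x$ is the identity (Cartesian coordinates), the new chart $y$ is polar coordinates, and the tangent vector is the velocity of a hypothetical sample curve. The new components are computed twice, once with the Jacobian and once directly from the curve by central differences.

```python
import math

def polar(px, py):
    """New chart y = (r, theta), valid on the right half-plane px > 0."""
    return (math.hypot(px, py), math.atan2(py, px))

def jacobian(px, py):
    """Matrix of d(y^b)/d(x^a) at (px, py), rows indexed by b, computed by hand."""
    r2 = px * px + py * py
    r = math.sqrt(r2)
    return [[px / r, py / r],
            [-py / r2, px / r2]]

def transform(components, px, py):
    """New components: sum over a of d(y^b)/d(x^a) * X^a."""
    jac = jacobian(px, py)
    return [sum(jac[b][a] * components[a] for a in range(2)) for b in range(2)]

def curve(t):
    """A sample curve through p = curve(0) = (1, 1) with velocity (2, -1)."""
    return (1.0 + 2.0 * t, 1.0 - t + t * t)

# Components in the Cartesian chart are the velocity of the curve itself.
X_old = [2.0, -1.0]

# Components in the polar chart, via the Jacobian ...
X_new = transform(X_old, 1.0, 1.0)

# ... and directly as the velocity of y o curve at t = 0.
h = 1e-6
plus, minus = polar(*curve(h)), polar(*curve(-h))
X_new_from_curve = [(plus[b] - minus[b]) / (2 * h) for b in range(2)]

print(X_new, X_new_from_curve)
```

Both computations should agree up to discretisation error, which reflects the fact that the components change with the chart while the tangent vector itself does not.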


Remark 9.19. By abusing notation, we can write the previous equations in a more familiar form. Denote by $y^b$ the maps $y^b \circ x^{-1}\colon x(U) \subseteq \mathbb{R}^{\dim M} \to \mathbb{R}$; these are real functions of $\dim M$ independent real variables. Since here we are only interested in what happens at the point $p \in M$, we can think of the maps $x^1,\dots,x^{\dim M}$ as the independent variables of each of the $y^b$.

This is a general fact: if $\{*\}$ is a singleton (we let $*$ denote its unique element) and $x\colon \{*\} \to A$, $y\colon A \to B$ are maps, then $y \circ x$ is the same as the map $y$ with independent variable $x$. Intuitively, $x$ just “chooses” an element of $A$.

Hence, we have $y^b = y^b(x^1,\dots,x^{\dim M})$ and we can write
\[ \left(\frac{\partial}{\partial x^a}\right)_p = \frac{\partial y^b}{\partial x^a}(x(p)) \left(\frac{\partial}{\partial y^b}\right)_p \qquad\text{and}\qquad \widetilde{X}^b = \frac{\partial y^b}{\partial x^a}(x(p))\, X^a, \]
which correspond to our earlier $e_a = A^b{}_a\, \widetilde{e}_b$ and $\widetilde{v}^b = A^b{}_a\, v^a$. The function $y = y(x)$ expresses the new co-ordinates in terms of the old ones, and $A^b{}_a$ is the Jacobian matrix of this map, evaluated at $x(p)$. The inverse transformation, of course, is given by
\[ B^a{}_b = (A^{-1})^a{}_b = \frac{\partial x^a}{\partial y^b}(y(p)). \]

Remark 9.20. The formula for change of components of vectors under a change of chart suggests yet another way to define the tangent space to $M$ at $p$.

Let $\mathscr{A}_p := \{(U,x) \in \mathscr{A} \mid p \in U\}$ be the set of charts on $M$ containing $p$. A tangent vector $v$ at $p$ is a map
\[ v\colon \mathscr{A}_p \to \mathbb{R}^{\dim M} \]
satisfying
\[ v((V,y)) = A\, v((U,x)), \]
where $A$ is the Jacobian matrix of $y \circ x^{-1}\colon \mathbb{R}^{\dim M} \to \mathbb{R}^{\dim M}$ at $x(p)$. In components, we have
\[ [v((V,y))]^b = \frac{\partial y^b}{\partial x^a}(x(p))\, [v((U,x))]^a. \]
The tangent space $T_pM$ is then defined to be the set of all tangent vectors at $p$, endowed with the appropriate vector space structure.

What we have given above is the mathematically rigorous version of the definition of vector typically found in physics textbooks, i.e. that a vector is a “set of numbers” $v^a$ which, under a change of coordinates $y = y(x)$, transform as
\[ \widetilde{v}^b = \frac{\partial y^b}{\partial x^a}\, v^a. \]

For a comparison of the different definitions of $T_pM$ that we have presented and a proof of their equivalence, refer to Chapter 2 of Vector Analysis, by Klaus Jänich.

10 Construction of the tangent bundle

10.1 Cotangent spaces and the differential


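Since the tangent space is a vector space, we can do all the constructions we saw previously in the abstract vector space setting.

Definition. Let $M$ be a manifold and $p \in M$. The cotangent space to $M$ at $p$ is
\[ T^*_pM := (T_pM)^*. \]
Since $\dim T_pM$ is finite, we have $T_pM \cong_{\mathrm{vec}} T^*_pM$. If $\left\{\left(\frac{\partial}{\partial x^a}\right)_p\right\}$ is the basis of $T_pM$ induced by some chart $(U,x)$, then the dual basis is denoted as $\{(\mathrm{d}x^a)_p\}$. We have, by definition
\[ (\mathrm{d}x^a)_p\!\left(\left(\frac{\partial}{\partial x^b}\right)_p\right) = \delta^a_b. \]

Once we have the cotangent space, we can define the tensor spaces.

Definition. Let $M$ be a manifold and $p \in M$. The tensor space $(T^r_s)_pM$ is defined as
\[ (T^r_s)_pM := T^r_s(T_pM) = \underbrace{T_pM \otimes \cdots \otimes T_pM}_{r \text{ copies}} \otimes \underbrace{T^*_pM \otimes \cdots \otimes T^*_pM}_{s \text{ copies}}. \]

Definition. Let $M$ and $N$ be manifolds and let $\phi\colon M \to N$ be smooth. The differential (or derivative) of $\phi$ at $p \in M$ is the linear map
\[ \mathrm{d}_p\phi\colon T_pM \to T_{\phi(p)}N, \qquad X \mapsto \mathrm{d}_p\phi(X), \]
where $\mathrm{d}_p\phi(X)$ is the tangent vector to $N$ at $\phi(p)$
\[ \mathrm{d}_p\phi(X)\colon C^\infty(N) \to \mathbb{R}, \qquad g \mapsto (\mathrm{d}_p\phi(X))(g) := X(g \circ \phi). \]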
If this definition looks confusing, it is worth pausing to think about what it is saying. Intuitively, if $\phi$ takes us from $M$ to $N$, then $\mathrm{d}_p\phi$ takes us from $T_pM$ to $T_{\phi(p)}N$. The way in which it does so is the following.

[Diagram: two commutative triangles. Left: $\phi\colon M \to N$, $g\colon N \to \mathbb{R}$ and the composite $g \circ \phi\colon M \to \mathbb{R}$. Right: $- \circ \phi\colon C^\infty(N) \to C^\infty(M)$, $X\colon C^\infty(M) \to \mathbb{R}$ and the composite $\mathrm{d}_p\phi(X)\colon C^\infty(N) \to \mathbb{R}$.]

Given $X \in T_pM$, we want to construct $\mathrm{d}_p\phi(X) \in T_{\phi(p)}N$, i.e. a derivation on $N$ at $\phi(p)$. Derivations act on functions. So, given $g\colon N \to \mathbb{R}$, we want to construct a real number by using $\phi$ and $X$. There is really only one way to do this. If we precompose $g$ with $\phi$, we obtain $g \circ \phi\colon M \to \mathbb{R}$, which is an element of $C^\infty(M)$. We can then happily apply $X$ to this function to obtain a real number. You should check that $\mathrm{d}_p\phi(X)$ is indeed a tangent vector to $N$.

Remark 10.1. Note that, to be careful, we should replace $C^\infty(M)$ and $C^\infty(N)$ above with $C^\infty(U)$ and $C^\infty(V)$, where $U \subseteq M$ and $V \subseteq N$ are open and contain $p$ and $\phi(p)$, respectively.


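Example 10.2. If $M = \mathbb{R}^d$ and $N = \mathbb{R}^{d'}$, then the differential of $f\colon \mathbb{R}^d \to \mathbb{R}^{d'}$ at $p \in \mathbb{R}^d$
\[ \mathrm{d}_pf\colon T_p\mathbb{R}^d \cong_{\mathrm{vec}} \mathbb{R}^d \to T_{f(p)}\mathbb{R}^{d'} \cong_{\mathrm{vec}} \mathbb{R}^{d'} \]
is none other than the Jacobian of $f$ at $p$.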
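To make the identification with the Jacobian concrete, here is a small numerical sketch. It assumes a hypothetical sample map $f\colon \mathbb{R}^2 \to \mathbb{R}^3$, $f(u,v) = (uv,\, u+v,\, u^2)$, represents a tangent vector at $p$ by the straight-line curve $t \mapsto p + tX$, and approximates the velocity of the image curve by central differences. None of these choices come from the notes.

```python
def f(u, v):
    """Sample smooth map R^2 -> R^3 (an illustrative choice)."""
    return (u * v, u + v, u * u)

def jacobian_f(u, v):
    """Jacobian matrix of f at (u, v), computed by hand."""
    return [[v, u],
            [1.0, 1.0],
            [2.0 * u, 0.0]]

def push_forward_by_curve(p, X, h=1e-6):
    """Components of d_p f(X) as the velocity of f o gamma, gamma(t) = p + t X."""
    plus = f(p[0] + h * X[0], p[1] + h * X[1])
    minus = f(p[0] - h * X[0], p[1] - h * X[1])
    return [(a - b) / (2 * h) for a, b in zip(plus, minus)]

def push_forward_by_jacobian(p, X):
    """Components of d_p f(X) as Jacobian times component vector."""
    jac = jacobian_f(*p)
    return [sum(row[a] * X[a] for a in range(2)) for row in jac]

p = (1.5, -2.0)
X = (0.5, 3.0)

print(push_forward_by_curve(p, X))
print(push_forward_by_jacobian(p, X))
```

The two results should agree up to rounding, in line with the description of $\mathrm{d}_pf$ as acting on a curve's tangent vector by differentiating the image curve.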
A special case of the differential is the gradient of a function in $C^\infty(M)$.

Definition. Let $M$ be a manifold and let $f\colon M \to \mathbb{R}$ be smooth. The gradient of $f$ at $p \in M$ is the covector
\[ \mathrm{d}_pf\colon T_pM \to T_{f(p)}\mathbb{R} \cong_{\mathrm{vec}} \mathbb{R}, \qquad X \mapsto \mathrm{d}_pf(X) := X(f). \]
In fact, we can define the gradient operator at $p \in M$ as the $\mathbb{R}$-linear map
\[ \mathrm{d}_p\colon C^\infty(U) \to T^*_pM, \qquad f \mapsto \mathrm{d}_pf, \]
with $p \in U \subseteq M$.

Remark 10.3. Note that, by writing $\mathrm{d}_pf(X) := X(f)$, we have committed a slight (but nonetheless real) abuse of notation. Since $\mathrm{d}_pf(X) \in T_{f(p)}\mathbb{R}$, it takes in a function and returns a real number, but $X(f)$ is already a real number! This is due to the fact that we have implicitly employed the isomorphism
\[ \iota_d\colon T_p\mathbb{R}^d \to \mathbb{R}^d, \qquad X \mapsto (X(\mathrm{proj}_1),\dots,X(\mathrm{proj}_d)), \]
which, when $d = 1$, reads
\[ \iota_1\colon T_p\mathbb{R} \to \mathbb{R}, \qquad X \mapsto X(\mathrm{id}_{\mathbb{R}}). \]
In our case, we have
\[ \mathrm{d}_pf(X) := X(- \circ f) \ \overset{\iota_1}{\longmapsto}\ X(\mathrm{id}_{\mathbb{R}} \circ f) = X(f). \]
This notwithstanding, the best way to think of $\mathrm{d}_pf$ is as a covector, i.e. $\mathrm{d}_pf$ takes in a tangent vector $X$ and returns the real number $X(f)$, in a linear fashion.

Recall that if $(U,x)$ is a chart on $M$, then the co-ordinate maps $x^a\colon U \to \mathbb{R}$ are smooth functions on $U$. We can thus apply the gradient operator $\mathrm{d}_p$ (with $p \in U$) to each of them to obtain $(\dim M)$-many elements of $T^*_pM$.

Proposition 10.4. Let $(U,x)$ be a chart on $M$, with $p \in U$. The set $\mathcal{B} = \{\mathrm{d}_px^a \mid 1 \le a \le \dim M\}$ forms a basis of $T^*_pM$.




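Proof. We already know that $\dim T^*_pM = \dim M$, since it is the dual space to $T_pM$. As $|\mathcal{B}| = \dim M$ by construction, it suffices to show that it is linearly independent. Suppose that
\[ \lambda_a\, \mathrm{d}_px^a = 0, \]
for some $\lambda_a \in \mathbb{R}$. Applying the left hand side to the basis element $\left(\frac{\partial}{\partial x^b}\right)_p$ yields
\begin{align*}
0 = \lambda_a\, \mathrm{d}_px^a\!\left(\left(\frac{\partial}{\partial x^b}\right)_p\right) &= \lambda_a \left(\frac{\partial}{\partial x^b}\right)_p (x^a) && \text{(definition of $\mathrm{d}_px^a$)} \\
&= \lambda_a\, \partial_b(x^a \circ x^{-1})(x(p)) && \text{(definition of $\left(\tfrac{\partial}{\partial x^b}\right)_p$)} \\
&= \lambda_a\, \partial_b(\mathrm{proj}_a)(x(p)) \\
&= \lambda_a\, \delta^a_b \\
&= \lambda_b.
\end{align*}
Therefore, $\mathcal{B}$ is linearly independent and hence a basis of $T^*_pM$. Moreover, since we have shown that
\[ \mathrm{d}_px^a\!\left(\left(\frac{\partial}{\partial x^b}\right)_p\right) = \delta^a_b, \]
this basis is, in fact, the dual basis to $\left\{\left(\frac{\partial}{\partial x^a}\right)_p\right\}$.

Remark 10.5. Note a slight subtlety. Given a chart $(U,x)$ and the induced basis $\left\{\left(\frac{\partial}{\partial x^a}\right)_p\right\}$ of $T_pM$, the dual basis to $\left\{\left(\frac{\partial}{\partial x^a}\right)_p\right\}$ exists simply by virtue of $T^*_pM$ being the dual space to $T_pM$. What we have shown above is that the elements of this dual basis are given explicitly by the gradients of the co-ordinate maps of $(U,x)$. In our notation, we have
\[ (\mathrm{d}x^a)_p = \mathrm{d}_px^a, \qquad 1 \le a \le \dim M. \]

10.2 Push-forward and pull-back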


The push-forward of a smooth map $\phi\colon M \to N$ at $p \in M$ is just another name for the differential of $\phi$ at $p$. We give the definition again in order to establish the new notation.

Definition. Let $\phi\colon M \to N$ be a smooth map between smooth manifolds. The push-forward of $\phi$ at $p \in M$ is the linear map:
\[ (\phi_*)_p\colon T_pM \to T_{\phi(p)}N, \qquad X \mapsto (\phi_*)_p(X) := X(- \circ \phi). \]

If $\gamma\colon \mathbb{R} \to M$ is a smooth curve on $M$ and $\phi\colon M \to N$ is smooth, then $\phi \circ \gamma\colon \mathbb{R} \to N$ is a smooth curve on $N$.

Proposition 10.6. Let $\phi\colon M \to N$ be smooth. The tangent vector $X_{\gamma,p} \in T_pM$ is pushed forward to the tangent vector $X_{\phi\circ\gamma,\phi(p)} \in T_{\phi(p)}N$, i.e.
\[ (\phi_*)_p(X_{\gamma,p}) = X_{\phi\circ\gamma,\phi(p)}. \]

Proof. Let $f \in C^\infty(V)$, with $(V,x)$ a chart on $N$ and $\phi(p) \in V$. By applying the definitions, we have
\begin{align*}
(\phi_*)_p(X_{\gamma,p})(f) &= (X_{\gamma,p})(f \circ \phi) && \text{(definition of $(\phi_*)_p$)} \\
&= ((f \circ \phi) \circ \gamma)'(0) && \text{(definition of $X_{\gamma,p}$)} \\
&= (f \circ (\phi \circ \gamma))'(0) && \text{(associativity of $\circ$)} \\
&= X_{\phi\circ\gamma,\phi(p)}(f) && \text{(definition of $X_{\phi\circ\gamma,\phi(p)}$)}
\end{align*}
Since $f$ was arbitrary, we have $(\phi_*)_p(X_{\gamma,p}) = X_{\phi\circ\gamma,\phi(p)}$.

Related to the push-forward, there is the notion of pull-back of a smooth map.

Definition. Let $\phi\colon M \to N$ be a smooth map between smooth manifolds. The pull-back of $\phi$ at $p \in M$ is the linear map:
\[ (\phi^*)_p\colon T^*_{\phi(p)}N \to T^*_pM, \qquad \omega \mapsto (\phi^*)_p(\omega), \]
where $(\phi^*)_p(\omega)$ is defined as
\[ (\phi^*)_p(\omega)\colon T_pM \to \mathbb{R}, \qquad X \mapsto \omega((\phi_*)_p(X)). \]

In words, if $\omega$ is a covector on $N$, its pull-back $(\phi^*)_p(\omega)$ is a covector on $M$. It acts on tangent vectors on $M$ by first pushing them forward to tangent vectors on $N$, and then applying $\omega$ to them to produce a real number.
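In components, the defining relation of the pull-back can be tested directly. The sketch below works under assumptions of our own: $M = \mathbb{R}^2$ and $N = \mathbb{R}^3$ with their identity charts, the sample map $\phi(u,v) = (uv,\, u+v,\, u^2)$, and vectors and covectors represented by their component lists with respect to the chart-induced bases. With these identifications the push-forward acts by the Jacobian matrix and the pull-back by its transpose.

```python
def jacobian_phi(u, v):
    """Jacobian of the sample map phi(u, v) = (u v, u + v, u^2), rows indexed by b."""
    return [[v, u],
            [1, 1],
            [2 * u, 0]]

def push_forward(p, X):
    """Components of (phi_*)_p(X): Jacobian applied to the components of X."""
    jac = jacobian_phi(*p)
    return [sum(jac[b][a] * X[a] for a in range(2)) for b in range(3)]

def pull_back(p, omega):
    """Components of (phi^*)_p(omega): transposed Jacobian applied to omega."""
    jac = jacobian_phi(*p)
    return [sum(omega[b] * jac[b][a] for b in range(3)) for a in range(2)]

def pair(covector, vector):
    """Evaluate a covector on a vector, both given by components."""
    return sum(c * v for c, v in zip(covector, vector))

p = (2, -1)
X = [3, 5]           # a tangent vector at p
omega = [1, -4, 2]   # a covector at phi(p)

lhs = pair(pull_back(p, omega), X)      # (phi^* omega)(X)
rhs = pair(omega, push_forward(p, X))   # omega(phi_* X)
print(lhs, rhs)
```

Note that `pull_back` needs only the covector on $N$ and the Jacobian at $p$, which mirrors the fact that pulling back a covector requires no inverse of $\phi$.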


Remark 10.7. If you don’t see it immediately, then you should spend some time proving 


that all the maps that we have defined so far and claimed to be linear are, in fact, linear. 


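Remark 10.8. We have seen that, given a smooth $\phi\colon M \to N$, we can push a vector $X \in T_pM$ forward to a vector $(\phi_*)_p(X) \in T_{\phi(p)}N$, and pull a covector $\omega \in T^*_{\phi(p)}N$ back to a covector $(\phi^*)_p(\omega) \in T^*_pM$.

[Diagram: $- \circ \phi\colon C^\infty(N) \to C^\infty(M)$ with $X$ and $(\phi_*)_p(X)$ mapping to $\mathbb{R}$; $(\phi_*)_p\colon T_pM \to T_{\phi(p)}N$ with $(\phi^*)_p(\omega)$ and $\omega$ mapping to $\mathbb{R}$.]

However, if $\phi\colon M \to N$ is a diffeomorphism, then we can also pull a vector $Y \in T_{\phi(p)}N$ back to a vector $(\phi^*)_p(Y) \in T_pM$, and push a covector $\eta \in T^*_pM$ forward to a covector $(\phi_*)_p(\eta) \in T^*_{\phi(p)}N$, by using $\phi^{-1}$ as follows:
\begin{align*}
(\phi^*)_p(Y) &:= ((\phi^{-1})_*)_{\phi(p)}(Y) \\
(\phi_*)_p(\eta) &:= ((\phi^{-1})^*)_{\phi(p)}(\eta).
\end{align*}

[Diagram: $- \circ \phi^{-1}\colon C^\infty(M) \to C^\infty(N)$ with $Y$ and $(\phi^*)_p(Y)$ mapping to $\mathbb{R}$; $((\phi^{-1})_*)_{\phi(p)}\colon T_{\phi(p)}N \to T_pM$ with $\eta$ and $(\phi_*)_p(\eta)$ mapping to $\mathbb{R}$.]

This is only possible if $\phi$ is a diffeomorphism. In general, you should keep in mind that

Vectors are pushed forward,
covectors are pulled back.

Remark 10.9. Given a smooth map $\phi\colon M \to N$, if $f \in C^\infty(N)$, then $f \circ \phi$ is often called the pull-back of $f$ along $\phi$. Similarly, if $\gamma$ is a curve on $M$, then $\phi \circ \gamma$ is called the push-forward of $\gamma$ along $\phi$. For example, we can say that

• the push-forward of a tangent vector acting on a function is the tangent vector acting on the pull-back of the function;

• the push-forward of a tangent vector to a curve is the tangent vector to the push-forward of the curve.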


10.3 Immersions and embeddings

We will now consider the question of under which circumstances a smooth manifold can “sit” in $\mathbb{R}^d$, for some $d \in \mathbb{N}$. There are, in fact, two notions of sitting inside another manifold, called immersion and embedding.

Definition. A smooth map $\phi\colon M \to N$ is said to be an immersion of $M$ into $N$ if the derivative
\[ \mathrm{d}_p\phi \equiv (\phi_*)_p\colon T_pM \to T_{\phi(p)}N \]
is injective, for all $p \in M$. The manifold $M$ is said to be an immersed submanifold of $N$.

From the theory of linear algebra, we immediately deduce that, for $\phi\colon M \to N$ to be an immersion, we must have $\dim M \le \dim N$. A closely related notion is that of a submersion, where we require each $(\phi_*)_p$ to be surjective, and thus we must have $\dim M \ge \dim N$. However, we will not need this here.


Example 10.10. Consider the map $\phi\colon S^1 \to \mathbb{R}^2$ whose image is reproduced below.

[Figure: the image of $\phi$ in $\mathbb{R}^2$, a closed curve passing twice through the point $\phi(p) = \phi(q)$, with a blue and a red arrow attached at that point.]

The map $\phi$ is not injective, i.e. there are $p,q \in S^1$, with $p \neq q$ and $\phi(p) = \phi(q)$. Of course, this means that $T_{\phi(p)}\mathbb{R}^2 = T_{\phi(q)}\mathbb{R}^2$. However, the maps $(\phi_*)_p$ and $(\phi_*)_q$ are both injective, with their images being represented by the blue and red arrows, respectively. Hence, the map $\phi$ is an immersion.

Definition. A smooth map $\phi\colon M \to N$ is said to be a (smooth) embedding of $M$ into $N$ if

• $\phi\colon M \to N$ is an immersion;
• $M \cong_{\mathrm{top}} \phi(M) \subseteq N$, where $\phi(M)$ carries the subset topology inherited from $N$.

The manifold $M$ is said to be an embedded submanifold of $N$.


Remark 10.11. If a continuous map between topological spaces satisfies the second condi- 
tion above, then it is called a topological embedding. Therefore, a smooth embedding is a 
topological embedding which is also an immersion (as opposed to simply being a smooth 
topological embedding). 


In the early days of differential geometry there were two approaches to study manifolds. One was the extrinsic view, within which manifolds are defined as special subsets of $\mathbb{R}^d$, and the other was the intrinsic view, which is the view that we have adopted here.

Whitney’s theorem, which we will state without proof, states that these two approaches are, in fact, equivalent.

Theorem 10.12 (Whitney). Any smooth manifold $M$ can be

• embedded in $\mathbb{R}^{2\dim M}$;

• immersed in $\mathbb{R}^{2\dim M - 1}$.

Example 10.13. The Klein bottle can be embedded in $\mathbb{R}^4$ but not in $\mathbb{R}^3$. It can, however, be immersed in $\mathbb{R}^3$.

What we have presented above is referred to as the strong version of Whitney’s theorem. There is a weak version as well, but there are also even stronger versions of this result, such as the following.

Theorem 10.14. Any smooth manifold can be immersed in $\mathbb{R}^{2\dim M - a(\dim M)}$, where $a(n)$ is the number of 1s in a binary expansion of $n \in \mathbb{N}$.

Example 10.15. If $\dim M = 3$, then as
\[ 3_{10} = (1 \times 2^1 + 1 \times 2^0)_{10} = 11_2, \]
we have $a(\dim M) = 2$, and thus every 3-dimensional manifold can be immersed into $\mathbb{R}^4$. Note that even the strong version of Whitney’s theorem only tells us that we can immerse $M$ into $\mathbb{R}^5$.
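The arithmetic in the last example is easy to automate. The following sketch simply evaluates the two bounds as stated in Theorems 10.12 and 10.14; the function names are our own, and the code says nothing about whether a given bound is optimal for a particular manifold.

```python
def ones_in_binary(n):
    """a(n): the number of 1s in the binary expansion of n."""
    return bin(n).count("1")

def whitney_immersion_dimension(n):
    """Target dimension 2n - 1 from the strong Whitney theorem."""
    return 2 * n - 1

def improved_immersion_dimension(n):
    """Target dimension 2n - a(n) from Theorem 10.14."""
    return 2 * n - ones_in_binary(n)

for n in (2, 3, 8):
    print(n, whitney_immersion_dimension(n), improved_immersion_dimension(n))
```

For $n = 3$ this reproduces the values 5 and 4 from the example above, and for $n = 2$ both bounds give 3, consistent with the Klein bottle being immersible in $\mathbb{R}^3$.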




10.4 The tangent bundle 


We would like to define a vector field on a manifold $M$ as a “smooth” map that assigns to each $p \in M$ a tangent vector in $T_pM$. However, since this would then be a “map” to a different space at each point, it is unclear how to define its smoothness.

The simplest solution is to merge all the tangent spaces into a single set and equip it with a smooth structure, so that we can then define a vector field as a smooth map between smooth manifolds.

Definition. Given a smooth manifold $M$, the tangent bundle of $M$ is the disjoint union of all the tangent spaces to $M$, i.e.
\[ TM := \bigsqcup_{p \in M} T_pM, \]
equipped with the canonical projection map
\[ \pi\colon TM \to M, \qquad X \mapsto p, \]
where $p$ is the unique $p \in M$ such that $X \in T_pM$.


We now need to equip $TM$ with the structure of a smooth manifold. We can achieve this by constructing a smooth atlas for $TM$ from a smooth atlas on $M$, as follows.

Let $\mathscr{A}_M$ be a smooth atlas on $M$ and let $(U,x) \in \mathscr{A}_M$. If $X \in \mathrm{preim}_\pi(U) \subseteq TM$, then $X \in T_{\pi(X)}M$, by definition of $\pi$. Moreover, since $\pi(X) \in U$, we can expand $X$ in terms of the basis induced by the chart $(U,x)$:
\[ X = X^a \left(\frac{\partial}{\partial x^a}\right)_{\pi(X)}, \]
where $X^1,\dots,X^{\dim M} \in \mathbb{R}$. We can then define the map
\[ \xi\colon \mathrm{preim}_\pi(U) \to x(U) \times \mathbb{R}^{\dim M} \subseteq \mathbb{R}^{2\dim M}, \qquad X \mapsto \bigl(x^1(\pi(X)),\dots,x^{\dim M}(\pi(X)),\, X^1,\dots,X^{\dim M}\bigr). \]

Assuming that $TM$ is equipped with a suitable topology, for instance the initial topology (i.e. the coarsest topology on $TM$ that makes $\pi$ continuous), we claim that the pair $(\mathrm{preim}_\pi(U),\xi)$ is a chart on $TM$ and
\[ \mathscr{A}_{TM} := \{(\mathrm{preim}_\pi(U),\xi) \mid (U,x) \in \mathscr{A}_M\} \]
is a smooth atlas on $TM$. Note that, from its definition, it is clear that $\xi$ is a bijection. We will not show that $(\mathrm{preim}_\pi(U),\xi)$ is a chart here, but we will show that $\mathscr{A}_{TM}$ is a smooth atlas.


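Proposition 10.16. Any two charts $(\mathrm{preim}_\pi(U),\xi), (\mathrm{preim}_\pi(\widetilde{U}),\widetilde{\xi}) \in \mathscr{A}_{TM}$ are $C^\infty$-compatible.

Proof. Let $(U,x)$ and $(\widetilde{U},\widetilde{x})$ be the two charts on $M$ giving rise to $(\mathrm{preim}_\pi(U),\xi)$ and $(\mathrm{preim}_\pi(\widetilde{U}),\widetilde{\xi})$, respectively. We need to show that the map
\[ \widetilde{\xi} \circ \xi^{-1}\colon x(U \cap \widetilde{U}) \times \mathbb{R}^{\dim M} \to \widetilde{x}(U \cap \widetilde{U}) \times \mathbb{R}^{\dim M} \]
is smooth, as a map between open subsets of $\mathbb{R}^{2\dim M}$. Recall that such a map is smooth if, and only if, it is smooth componentwise. On the first $\dim M$ components, $\widetilde{\xi} \circ \xi^{-1}$ acts as
\[ \widetilde{x} \circ x^{-1}\colon x(U \cap \widetilde{U}) \to \widetilde{x}(U \cap \widetilde{U}), \qquad x(p) \mapsto \widetilde{x}(p), \]
while on the remaining $\dim M$ components it acts as the change of vector components we met previously (with $\widetilde{x}$ in place of $y$), i.e.
\[ X^a \mapsto \widetilde{X}^b = \partial_a(\widetilde{x}^b \circ x^{-1})(x(p))\, X^a. \]
Hence, we have
\[ \widetilde{\xi} \circ \xi^{-1}\colon x(U \cap \widetilde{U}) \times \mathbb{R}^{\dim M} \to \widetilde{x}(U \cap \widetilde{U}) \times \mathbb{R}^{\dim M}, \qquad \bigl(x(\pi(X)),\, X^1,\dots,X^{\dim M}\bigr) \mapsto \bigl(\widetilde{x}(\pi(X)),\, \widetilde{X}^1,\dots,\widetilde{X}^{\dim M}\bigr), \]
which is smooth in each component, and hence smooth.

The tangent bundle of a smooth manifold $M$ is therefore itself a smooth manifold of dimension $2\dim M$, and the projection $\pi\colon TM \to M$ is smooth with respect to this structure.

Similarly, one can construct the cotangent bundle $T^*M$ to $M$ by defining
\[ T^*M := \bigsqcup_{p \in M} T^*_pM \]
and going through the above again, using the dual basis $\{(\mathrm{d}x^a)_p\}$ instead of $\left\{\left(\frac{\partial}{\partial x^a}\right)_p\right\}$.

11 Tensor space theory II: over a ring

11.1 Vector fields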


Now that we have defined the tangent bundle, we are ready to define vector fields. 


Definition. Let $M$ be a smooth manifold, and let $TM \xrightarrow{\ \pi\ } M$ be its tangent bundle. A vector field on $M$ is a smooth section of the tangent bundle, i.e. a smooth map $\sigma\colon M \to TM$ such that $\pi \circ \sigma = \mathrm{id}_M$.

[Diagram: $TM$ above $M$, connected by the projection $\pi$ and the section $\sigma$.]

We denote the set of all vector fields on $M$ by $\Gamma(TM)$, i.e.
\[ \Gamma(TM) := \{\sigma\colon M \to TM \mid \sigma \text{ is smooth and } \pi \circ \sigma = \mathrm{id}_M\}. \]
This is, in fact, the standard notation for the set of all sections of a bundle.

Remark 11.1. An equivalent definition is that a vector field $\sigma$ on $M$ is a derivation on the algebra $C^\infty(M)$, i.e. an $\mathbb{R}$-linear map
\[ \sigma\colon C^\infty(M) \to C^\infty(M) \]
satisfying the Leibniz rule (with respect to pointwise multiplication on $C^\infty(M)$)
\[ \sigma(fg) = g\,\sigma(f) + f\,\sigma(g). \]
This definition is better suited for some purposes, and later on we will switch from one to the other without making any notational distinction between them.


Example 11.2. Let $(U,x)$ be a chart on $M$. For each $1 \le a \le \dim M$, the map
\[ \frac{\partial}{\partial x^a}\colon U \to TU, \qquad p \mapsto \left(\frac{\partial}{\partial x^a}\right)_p \]
is a vector field on the submanifold $U$. We can also think of this as a linear map
\[ \frac{\partial}{\partial x^a}\colon C^\infty(U) \to C^\infty(U), \qquad f \mapsto \frac{\partial}{\partial x^a}(f) := \partial_a(f \circ x^{-1}) \circ x. \]
By abuse of notation, one usually denotes the right hand side above simply as $\partial_a f$.

Recall that, given a smooth map $\phi\colon M \to N$, the push-forward $(\phi_*)_p$ is a linear map that takes in a tangent vector in $T_pM$ and outputs a tangent vector in $T_{\phi(p)}N$. Of course, we have one such map for each $p \in M$. We can collect them all into a single smooth map.

Definition. Let $\phi\colon M \to N$ be smooth. The push-forward $\phi_*$ is defined as
\[ \phi_*\colon TM \to TN, \qquad X \mapsto \phi_*(X) := (\phi_*)_{\pi(X)}(X). \]

Any vector $X \in TM$ must belong to $T_pM$ for some $p \in M$, namely $p = \pi(X)$. The map $\phi_*$ simply takes any vector $X \in TM$ and applies the usual push-forward at the “right” point, producing a vector in $TN$. One can similarly define $\phi^*\colon T^*N \to T^*M$.

The ideal next step would be to try to construct a map $\Phi_*\colon \Gamma(TM) \to \Gamma(TN)$ that allows us to push vector fields on $M$ forward to vector fields on $N$. Given $\sigma \in \Gamma(TM)$, we would like to construct a $\Phi_*(\sigma) \in \Gamma(TN)$. This is not trivial, since $\Phi_*(\sigma)$ needs to be, in particular, a smooth map $N \to TN$. Note that the composition $\phi_* \circ \sigma$ is a map $M \to TN$, and hence $\mathrm{im}_{\phi_* \circ \sigma}(M) \subseteq TN$. Thus, we can try to define $\Phi_*(\sigma)$ by mapping each $q \in N$ to some tangent vector in $\mathrm{im}_{\phi_* \circ \sigma}(M)$. Unfortunately, there are at least three ways in which this can go wrong.

1. The map $\phi$ may fail to be injective. Then, there would be two points $p_1, p_2 \in M$ such that $p_1 \neq p_2$ and $\phi(p_1) = \phi(p_2) =: q \in N$. Hence, we would have two tangent vectors on $N$ with base-point $q$, namely $(\phi_* \circ \sigma)(p_1)$ and $(\phi_* \circ \sigma)(p_2)$. These two need not be equal, and if they are not then the map $\Phi_*(\sigma)$ is ill-defined at $q$.

2. The map $\phi$ may fail to be surjective. Then, there would be some $q \in N$ such that there is no $X \in \mathrm{im}_{\phi_* \circ \sigma}(M)$ with $\pi(X) = q$ (where $\pi\colon TN \to N$). The map $\Phi_*(\sigma)$ would then be undefined at $q$.

3. Even if the map $\phi$ is bijective, its inverse $\phi^{-1}$ may fail to be smooth. But then $\Phi_*(\sigma)$ would not be guaranteed to be a smooth map.

Of course, everything becomes easier if $\phi\colon M \to N$ is a diffeomorphism.

[Diagram: commutative square with $\phi_*\colon TM \to TN$ on top, $\phi\colon M \to N$ at the bottom, and the sections $\sigma\colon M \to TM$ and $\Phi_*(\sigma)\colon N \to TN$ as vertical arrows.]

If $\sigma \in \Gamma(TM)$, we can define the push-forward $\Phi_*(\sigma) \in \Gamma(TN)$ as
\[ \Phi_*(\sigma) := \phi_* \circ \sigma \circ \phi^{-1}. \]

More generally, if $\phi\colon M \to N$ is smooth and $\sigma \in \Gamma(TM)$, $\tau \in \Gamma(TN)$, we can define $\Phi_*(\sigma) = \tau$ if $\sigma$ and $\tau$ are $\phi$-related, i.e. if they satisfy
\[ \tau \circ \phi = \phi_* \circ \sigma. \]

We can equip the set $\Gamma(TM)$ with the following operations. The first is our, by now familiar, pointwise addition:
\[ \oplus\colon \Gamma(TM) \times \Gamma(TM) \to \Gamma(TM), \qquad (\sigma,\tau) \mapsto \sigma \oplus \tau, \]
where
\[ \sigma \oplus \tau\colon M \to TM, \qquad p \mapsto (\sigma \oplus \tau)(p) := \sigma(p) + \tau(p). \]
Note that the $+$ on the right hand side above is the addition in $T_pM$. More interestingly, we also define the following multiplication operation:
\[ \odot\colon C^\infty(M) \times \Gamma(TM) \to \Gamma(TM), \qquad (f,\sigma) \mapsto f \odot \sigma, \]
where
\[ f \odot \sigma\colon M \to TM, \qquad p \mapsto (f \odot \sigma)(p) := f(p)\,\sigma(p). \]
Note that since $f \in C^\infty(M)$, we have $f(p) \in \mathbb{R}$ and hence the multiplication above is the scalar multiplication on $T_pM$.


If we consider the triple $(C^\infty(M),+,\bullet)$, where $\bullet$ is pointwise function multiplication as defined in the section on algebras and derivations, then the triple $(\Gamma(TM),\oplus,\odot)$ satisfies

• $(\Gamma(TM),\oplus)$ is an abelian group, with $0 \in \Gamma(TM)$ being the section that maps each $p \in M$ to the zero tangent vector in $T_pM$;

• $\Gamma(TM) \setminus \{0\}$ satisfies

i) $\forall\, f \in C^\infty(M) : \forall\, \sigma,\tau \in \Gamma(TM) \setminus \{0\} : f \odot (\sigma \oplus \tau) = (f \odot \sigma) \oplus (f \odot \tau)$;
ii) $\forall\, f,g \in C^\infty(M) : \forall\, \sigma \in \Gamma(TM) \setminus \{0\} : (f + g) \odot \sigma = (f \odot \sigma) \oplus (g \odot \sigma)$;
iii) $\forall\, f,g \in C^\infty(M) : \forall\, \sigma \in \Gamma(TM) \setminus \{0\} : (f \bullet g) \odot \sigma = f \odot (g \odot \sigma)$;
iv) $\forall\, \sigma \in \Gamma(TM) \setminus \{0\} : 1 \odot \sigma = \sigma$,

where $1 \in C^\infty(M)$ maps every $p \in M$ to $1 \in \mathbb{R}$.

These are precisely the axioms for a vector space! The only obstacle to saying that $\Gamma(TM)$ is a vector space over $C^\infty(M)$ is that the triple $(C^\infty(M),+,\bullet)$ is not an algebraic field, but only a ring. We could simply talk about “vector spaces over rings”, but vector spaces over rings have wildly different properties than vector spaces over fields, so much so that they have their own name: modules.

Remark 11.3. Of course, we could have defined $\odot$ simply as pointwise global scaling, using the reals $\mathbb{R}$ instead of the real functions $C^\infty(M)$. Then, since $(\mathbb{R},+,\cdot)$ is an algebraic field, we would have the obvious $\mathbb{R}$-vector space structure on $\Gamma(TM)$. However, a basis for this vector space is necessarily uncountably infinite, and hence it does not provide a very useful decomposition for our vector fields.

Instead, the operation $\odot$ that we have defined allows for local scaling, i.e. we can scale a vector field by a different value at each point, and it yields a much more useful decomposition of vector fields within the module structure.
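The pointwise operations and the idea of local scaling can be mimicked in a few lines. The sketch below assumes $M = \mathbb{R}^2$ with its identity chart, so that a vector field is represented by a function returning the component pair at each point; the particular fields and functions are arbitrary illustrative choices, and integer sample points are used so that the equalities are exact.

```python
def add_fields(sigma, tau):
    """Pointwise sum of two vector fields."""
    return lambda p: tuple(s + t for s, t in zip(sigma(p), tau(p)))

def scale_field(f, sigma):
    """Local scaling: multiply sigma(p) by the number f(p)."""
    return lambda p: tuple(f(p) * s for s in sigma(p))

# Sample vector fields on R^2, given by their components.
sigma = lambda p: (-p[1], p[0])
tau = lambda p: (1, p[0] * p[1])

# Sample smooth functions on R^2.
f = lambda p: p[0] ** 2 + 1
g = lambda p: p[1] - 3
one = lambda p: 1

points = [(2, -1), (0, 4), (-3, 5)]

# The four module axioms, checked at the sample points.
axiom_i = all(scale_field(f, add_fields(sigma, tau))(p)
              == add_fields(scale_field(f, sigma), scale_field(f, tau))(p)
              for p in points)
axiom_ii = all(scale_field(lambda q: f(q) + g(q), sigma)(p)
               == add_fields(scale_field(f, sigma), scale_field(g, sigma))(p)
               for p in points)
axiom_iii = all(scale_field(lambda q: f(q) * g(q), sigma)(p)
                == scale_field(f, scale_field(g, sigma))(p)
                for p in points)
axiom_iv = all(scale_field(one, sigma)(p) == sigma(p) for p in points)

print(axiom_i, axiom_ii, axiom_iii, axiom_iv)
```

Here `scale_field(f, sigma)` rescales `sigma` by a different factor at each point, which a single real number could not do.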


11.2 Rings and modules over a ring


Unlike mathematicians, most people who apply mathematics tend to consider rings and 
modules somewhat esoteric objects, but they are not esoteric at all. As we have seen, they 
arise naturally in the study of manifolds and their unusual properties, at least when com- 
pared to fields and vector spaces, are of direct geometric relevance and make us understand 
the subject better. 

For your benefit, we first recall some basic facts about rings. 


Definition. A ring is a triple $(R,+,\cdot)$, where $R$ is a set and $+,\cdot\colon R \times R \to R$ are maps satisfying the following axioms

• $(R,+)$ is an abelian group:

i) $\forall\, a,b,c \in R : (a + b) + c = a + (b + c)$;
ii) $\exists\, 0 \in R : \forall\, a \in R : a + 0 = 0 + a = a$;
iii) $\forall\, a \in R : \exists\, {-a} \in R : a + (-a) = (-a) + a = 0$;
iv) $\forall\, a,b \in R : a + b = b + a$;

• the operation $\cdot$ is associative and distributes over addition:

v) $\forall\, a,b,c \in R : (a \cdot b) \cdot c = a \cdot (b \cdot c)$;
vi) $\forall\, a,b,c \in R : (a + b) \cdot c = a \cdot c + b \cdot c$;
vii) $\forall\, a,b,c \in R : a \cdot (b + c) = a \cdot b + a \cdot c$.

Note that since $\cdot$ is not required to be commutative, axioms vi and vii are both necessary.

Definition. A ring $(R,+,\cdot)$ is said to be

• commutative if $\forall\, a,b \in R : a \cdot b = b \cdot a$;

• unital if $\exists\, 1 \in R : \forall\, a \in R : 1 \cdot a = a \cdot 1 = a$;

• a division (or skew) ring if it is unital and
\[ \forall\, a \in R \setminus \{0\} : \exists\, a^{-1} \in R \setminus \{0\} : a \cdot a^{-1} = a^{-1} \cdot a = 1. \]

In a unital ring, an element for which there exists a multiplicative inverse is said to be a unit. The set of units of a ring $R$ is denoted by $R^*$ (not to be confused with the vector space dual) and forms a group under multiplication. Then, $R$ is a division ring iff
\[ R^* = R \setminus \{0\}. \]

Example 11.4. The sets $\mathbb{Z}$, $\mathbb{Q}$, $\mathbb{R}$, and $\mathbb{C}$ are all rings under the usual operations. They are also all fields, except $\mathbb{Z}$.
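The criterion $R^* = R \setminus \{0\}$ is easy to explore on small finite rings. As an illustration that goes slightly beyond the examples in the text, the sketch below uses the rings $\mathbb{Z}/n\mathbb{Z}$ of integers modulo $n$ (for $n \ge 2$), which are commutative and unital; the function names are our own, and the brute-force search is meant only for small $n$.

```python
def units(n):
    """Units of Z/nZ: residues that have a multiplicative inverse modulo n."""
    return {a for a in range(n) if any((a * b) % n == 1 for b in range(n))}

def is_division_ring(n):
    """Check the criterion R^* = R \\ {0} for R = Z/nZ with n >= 2."""
    return units(n) == set(range(1, n))

def closed_under_multiplication(n):
    """The units should form a group, so in particular be closed under products."""
    u = units(n)
    return all((a * b) % n in u for a in u for b in u)

print(sorted(units(12)), is_division_ring(12))
print(sorted(units(7)), is_division_ring(7))
```

For $n = 12$ only the residues coprime to 12 are units, so the criterion fails, whereas for the prime $n = 7$ every non-zero residue is a unit.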


Example 11.5. Let M be a smooth manifold. Then 


e (C“(M),+,-), where - is scalar multiplication (by a real number), is an R-vector 
space. It is not a ring since - is not a map C*(M) x C*(M) > C™(M). 


e (C~(M),+,¢), where e is pointwise multiplication of maps, is a commutative, unital 
ring, but not a division ring and hence, not a field. 


In general, if (A,+,-,¢) is an algebra, then (A,+,e) is a ring. 
Definition. Let (R,+,-) be a unital ring. A triple (V7, ,©) is called an R-module if the 
maps 

e:MxMaM 

©: RxM7>~M 


satisfy the vector space axioms, i.e. (IW7,@) is an abelian group and for all r,s € R and all 
m,n € M, we have 


i) rO(mOn) =(r Om) S(r On); 
ii) (r+s)Om=(rOm)@(sOm); 
iii) (r-s)Om=roO(sOm); 
iv) L\Om=mMm. 
Most definitions we had for vector spaces carry over unaltered to modules, including 


that of a basis, i.e. a linearly independent spanning set. 


Remark 11.6. Even though we will not need this, we note as an aside that what we have 
defined above is a left R-module, since multiplication has only been defined (and hence 
only makes sense) on the left. The definition of a right R-module is completely analogous. 
Moreover, if R and S are two unital rings, then we can define M to be an R-S-bimodule 
if it is a left $R$-module and a right $S$-module. The bimodule structure is precisely what is
needed to generalise the notion of derivation that we have met before. 


Example 11.7. Any ring R is trivially a module over itself. 
Example 11.8. The triple $(\Gamma(TM),\oplus,\odot)$ is a $C^\infty(M)$-module.

In the following, we will usually denote $\oplus$ by $+$ and suppress the $\odot$, as we did with
vector spaces.

11.3 Bases for modules

The key fact that sets modules apart from vector spaces is that, unlike a vector space, an 
R-module need not have a basis, unless R is a division ring. 

Theorem 11.9. If $D$ is a division ring, then any $D$-module $V$ admits a basis.


Corollary 11.10. Every vector space has a basis, since any field is also a division ring. 


Before we delve into the proof, let us consider some geometric examples. 


Example 11.11. a) Let $M = \mathbb{R}^2$ and consider $v \in \Gamma(T\mathbb{R}^2)$. It is a fact from standard
vector analysis that any such $v$ can be written uniquely as
$$v = v^1 e_1 + v^2 e_2$$
for some $v^1, v^2 \in C^\infty(\mathbb{R}^2)$ and $e_1, e_2 \in \Gamma(T\mathbb{R}^2)$. Hence, even though $\Gamma(T\mathbb{R}^2)$ is a
$C^\infty(\mathbb{R}^2)$-module and $C^\infty(\mathbb{R}^2)$ is not a division ring, it still has a basis. Note that the
coefficients in the linear expansion of $v$ are functions.

This example shows that the converse to the above theorem is not true: if $D$ is not a
division ring, then a $D$-module may or may not have a basis.


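b) Let $M = S^2$. A famous result in algebraic topology, known as the hairy ball theorem,
states that there is no nowhere-vanishing smooth tangent vector field on even-dimensional
$n$-spheres. Now suppose that $\Gamma(TS^2)$ had a basis. Since a basis generates $\Gamma(TS^2)$, its
elements must span $T_pS^2$ at every $p \in S^2$, so it contains at least two elements. It cannot
contain more: near any point, two of them, say $e_1$ and $e_2$, already span the tangent spaces,
so a third one satisfies $e_3 = a e_1 + b e_2$ on some open set $U$ with $a, b \in C^\infty(U)$, and
multiplying by a non-zero smooth function supported in $U$ yields a non-trivial linear
relation. Hence a basis would consist of exactly two vector fields which are linearly
independent, and in particular non-zero, at every point, contradicting the hairy ball
theorem. Therefore, $\Gamma(TS^2)$ does not have a basis.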


The proof of the theorem requires the axiom of choice, in the equivalent form known 
as Zorn’s lemma. 


Lemma 11.12 (Zorn). A partially ordered set P whose every totally ordered subset T has 
an upper bound in P contains a maximal element. 


Of course, we now need to define the new terms that appear in the statement of Zorn’s 
lemma. 


Definition. A partially ordered set (poset for short) is a pair $(P,\leq)$ where $P$ is a set and $\leq$
is a partial order on $P$, i.e. a relation on $P$ satisfying

i) reflexivity: $\forall a \in P : a \leq a$;
ii) anti-symmetry: $\forall a, b \in P : (a \leq b \wedge b \leq a) \Rightarrow a = b$;
iii) transitivity: $\forall a, b, c \in P : (a \leq b \wedge b \leq c) \Rightarrow a \leq c$.


In a partially ordered set, while every element is related to itself by reflexivity, two
distinct elements need not be related.

Definition. A totally ordered set is a pair $(P,\leq)$ where $P$ is a set and $\leq$ is a total order
on $P$, i.e. a relation on $P$ satisfying

a) anti-symmetry: $\forall a, b \in P : (a \leq b \wedge b \leq a) \Rightarrow a = b$;
b) transitivity: $\forall a, b, c \in P : (a \leq b \wedge b \leq c) \Rightarrow a \leq c$;
c) totality: $\forall a, b \in P : a \leq b \vee b \leq a$.

Note that a total order is a special case of a partial order since, by letting $a = b$ in the
totality condition, we obtain reflexivity. In a totally ordered set, every pair of elements is
related.


Definition. Let $(P,\leq)$ be a partially ordered set and let $T \subseteq P$. An element $u \in P$ is said
to be an upper bound for $T$ if
$$\forall t \in T : t \leq u.$$

Every single-element subset of a partially ordered set has at least one upper bound by
reflexivity. However, in general, a subset may not have any upper bound. This happens, for
example, if the subset has at least two elements, one of which is not related to any other
element of $P$.


Definition. Let $(P,\leq)$ be a partially ordered set. A maximal element of $P$ is an element
$m \in P$ such that
$$\nexists\, a \in P : m < a$$
or, alternatively,
$$\forall a \in P : (m \leq a) \Rightarrow (m = a).$$

Note that this is not equivalent to
$$\forall a \in P : a \leq m$$
unless $(P,\leq)$ is a totally ordered set.
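The difference between the two conditions is easy to see on a finite poset. The sketch below is only an illustration under assumed names (`maximal_elements` and `greatest_elements` are hypothetical helpers); it uses the proper subsets of $\{1,2,3\}$ ordered by inclusion, which have three maximal elements but no element lying above all others.

```python
def maximal_elements(elements, leq):
    """Elements m such that m <= a implies m == a."""
    return [m for m in elements if all(a == m or not leq(m, a) for a in elements)]


def greatest_elements(elements, leq):
    """Elements m such that a <= m for every a (there is at most one)."""
    return [m for m in elements if all(leq(a, m) for a in elements)]


# Proper subsets of {1, 2, 3}, partially ordered by inclusion.
subsets = [frozenset(s) for s in ([], [1], [2], [3], [1, 2], [1, 3], [2, 3])]


def inclusion(a, b):
    """The partial order: a <= b iff a is a subset of b."""
    return a <= b


print(maximal_elements(subsets, inclusion))
print(greatest_elements(subsets, inclusion))
```

On a totally ordered set such as $\{1,2,3\}$ with the usual order, both functions return the same single element.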
We are now ready to prove the theorem. 
Proof of Theorem 11.9. We will tackle this one step at a time. 


a) Let $S \subseteq V$ be a generating set of $V$, i.e.
$$\forall v \in V : \exists\, e_1, \dots, e_N \in S : \exists\, d^1, \dots, d^N \in D : v = d^a e_a.$$
A generating set always exists, as one may take $S = V$.

b) Define a partially ordered set $(P,\leq)$ by
$$P := \{U \in \mathcal{P}(S) \mid U \text{ is linearly independent}\}$$
and $\leq \, := \, \subseteq$, that is, we partially order $P$ by set-theoretic inclusion.




c) Let $T \subseteq P$ be any totally ordered subset of $P$. Then $\bigcup T$ is an upper bound for $T$,
and it is linearly independent (by the total ordering assumption, since any finite subset of
$\bigcup T$ is contained in a single element of $T$). Hence $\bigcup T \in P$.

By Zorn’s lemma, $P$ has a maximal element. Let $B \in P$ be any such element. By
construction, $B$ is a maximal (with respect to inclusion) linearly independent subset
of the generating set $S$.


d) We now claim that $S \subseteq \mathrm{span}_D(B)$. Indeed, let $v \in S \setminus B$. Then $B \cup \{v\} \in \mathcal{P}(S)$. Since
$B$ is maximal, the set $B \cup \{v\}$ is not linearly independent. Hence
$$\exists\, e_1, \dots, e_N \in B : \exists\, d, d^1, \dots, d^N \in D : d^a e_a + d v = 0,$$
where the coefficients $d, d^1, \dots, d^N$ are not all zero. In particular, $d \neq 0$, for if it
were, it would immediately follow that $d^a e_a = 0$ for some $d^1, \dots, d^N$, not all zero,
contradicting the linear independence of $B$.

Since $D$ is a division ring and $d \neq 0$, there exists a multiplicative inverse for $d$. Then
we can multiply both sides of the above equation by $d^{-1} \in D$ to obtain
$$v = (-d^{-1} \cdot d^a)\, e_a.$$
Hence $S \subseteq \mathrm{span}_D(B)$.


e) We therefore have
$$V = \mathrm{span}_D(S) = \mathrm{span}_D(B)$$
and thus $B$ is a basis of $V$.
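For a finite generating set no appeal to Zorn’s lemma is needed: a maximal linearly independent subset can be grown greedily. The following is a minimal sketch of that finite analogue, assuming vectors in $\mathbb{Q}^n$ stored as tuples; the helper names `rank`, `is_independent` and `extract_basis` are hypothetical.

```python
from fractions import Fraction


def rank(vectors):
    """Rank of a list of vectors over Q, computed by Gaussian elimination."""
    rows = [[Fraction(x) for x in v] for v in vectors]
    ncols = len(rows[0]) if rows else 0
    r = 0
    for col in range(ncols):
        pivot = next((i for i in range(r, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        inv = 1 / rows[r][col]  # exists because Q is a field, hence a division ring
        rows[r] = [inv * x for x in rows[r]]
        for i in range(len(rows)):
            if i != r and rows[i][col] != 0:
                factor = rows[i][col]
                rows[i] = [x - factor * y for x, y in zip(rows[i], rows[r])]
        r += 1
    return r


def is_independent(vectors):
    """A finite list of vectors is linearly independent iff its rank equals its length."""
    return rank(vectors) == len(vectors)


def extract_basis(generating_set):
    """Grow a linearly independent subset until no further generator can be added."""
    basis = []
    for v in generating_set:
        if is_independent(basis + [v]):
            basis.append(v)
    return basis


S = [(1, 2, 3), (2, 4, 6), (0, 1, 1), (1, 3, 4), (0, 0, 5)]
print(extract_basis(S))
```

The division by the pivot in `rank` is the step that relies on every non-zero scalar being invertible, mirroring the use of $d^{-1}$ in step d) of the proof.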


We stress again that if D is not a division ring, then a D-module may, but need not, 
have a basis. 


11.4 Module constructions and applications 
As for vector spaces, we can perform the usual constructions with modules as well. 


Definition. The direct sum of two $R$-modules $M$ and $N$ is the $R$-module $M \oplus N$, which
has $M \times N$ as its underlying set and operations (inherited from $M$ and $N$) defined
componentwise.

Note that while we have been using $\oplus$ to temporarily distinguish two “plus-like” operations
in different spaces, the symbol $\oplus$ is the standard notation for the direct sum.


Definition. An $R$-module $M$ is said to be
• finitely generated if it has a finite generating set;
• free if it has a basis;
• projective if it is a direct summand of a free $R$-module $F$, i.e.
$$M \oplus Q = F$$
for some $R$-module $Q$.

Example 11.13. As we have seen, $\Gamma(T\mathbb{R}^2)$ is free while $\Gamma(TS^2)$ is not.
Example 11.14. Clearly, every free module is also projective.


Definition. Let $M$ and $N$ be two (left) $R$-modules. A map $f : M \to N$ is said to be an
$R$-linear map, or an $R$-module homomorphism, if
$$\forall r \in R : \forall m_1, m_2 \in M : f(r m_1 + m_2) = r f(m_1) + f(m_2),$$
where it should be clear which operations are in $M$ and which in $N$.
A bijective module homomorphism is said to be a module isomorphism, and we write
$M \cong_{\mathrm{mod}} N$ if there exists a module isomorphism between them.

If $M$ and $N$ are right $R$-modules, then the linearity condition is written as
$$\forall r \in R : \forall m_1, m_2 \in M : f(m_1 r + m_2) = f(m_1) r + f(m_2).$$


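Proposition 11.15. If a finitely generated $R$-module $F$ is free, and $d \in \mathbb{N}$ is the
cardinality of a finite basis, then
$$F \cong_{\mathrm{mod}} \underbrace{R \oplus \cdots \oplus R}_{d \text{ copies}} =: R^d.$$

One can show that, for a commutative ring $R$ such as $C^\infty(M)$, if $R^d \cong_{\mathrm{mod}} R^{d'}$, then $d = d'$
and hence, the concept of dimension is well-defined for finitely generated, free modules.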


Theorem 11.16 (Serre, Swan, et al.). Let $E$ be a vector fibre bundle over a smooth manifold
$M$. Then, the set $\Gamma(E)$ of all smooth sections of $E$ over $M$ is a finitely generated, projective
$C^\infty(M)$-module.

A vector fibre bundle is a fibre bundle in which every fibre is a vector space. An example
is the tangent bundle to a manifold.
Remark 11.17. An immediate consequence of the theorem is that, for any vector fibre bundle
$E$ over $M$, there exists a $C^\infty(M)$-module $Q$ such that the direct sum $\Gamma(E) \oplus Q$ is free. If
$Q$ can be chosen to be the trivial module $\{0\}$, then $\Gamma(E)$ is itself free, as is the case with
$\Gamma(T\mathbb{R}^2)$. In a sense, the module $Q$ quantifies the failure of $\Gamma(E)$ to have a basis.


Theorem 11.18. Let $P, Q$ be finitely generated (projective) modules over a commutative
ring $R$. Then
$$\mathrm{Hom}_R(P, Q) := \{\phi : P \to Q \mid \phi \text{ is } R\text{-linear}\}$$
is again a finitely generated (projective) $R$-module, with operations defined pointwise.


The proof is exactly the same as with vector spaces. As an example, we can use this
to define the dual of a module. For instance
$$\mathrm{Hom}_{C^\infty(M)}(\Gamma(TM), C^\infty(M)) =: \Gamma(TM)^*.$$

One can show that $\Gamma(TM)^*$ coincides with the smooth sections over the cotangent bundle
$\Gamma(T^*M)$, i.e. the covector fields. Recall that, just like a vector field, a covector field is a
smooth section of the cotangent bundle $T^*M$, that is, a smooth map $\omega : M \to T^*M$ with
$\pi \circ \omega = \mathrm{id}_M$. Unlike what we had with vector fields, we can always define the pull-back of
a covector field along any smooth map between manifolds.

Definition. Let $\phi : M \to N$ be smooth and let $\omega \in \Gamma(T^*N)$. We define the pull-back
$\Phi^*(\omega) \in \Gamma(T^*M)$ of $\omega$ as
$$\Phi^*(\omega) : M \to T^*M, \qquad p \mapsto \Phi^*(\omega)(p),$$
where
$$\Phi^*(\omega)(p) : T_pM \to \mathbb{R}, \qquad X \mapsto \Phi^*(\omega)(p)(X) := \omega(\phi(p))(\phi_*(X)),$$
as in the following diagram
$$\begin{array}{ccc} T^*M & & T^*N \\ \Phi^*(\omega) \uparrow & & \uparrow \omega \\ M & \xrightarrow{\ \phi\ } & N \end{array}$$


We can, of course, generalise these ideas by constructing the $(r,s)$ tensor bundle of $M$
$$T^r_s M := \coprod_{p \in M} (T^r_s)_p M$$
and hence define a type $(r,s)$ tensor field on $M$ to be an element of $\Gamma(T^r_s M)$, i.e. a smooth
section of the tensor bundle. If $\phi : M \to N$ is smooth, we can define the pull-back of
covariant (i.e. type $(0,q)$) tensor fields by generalising the pull-back of covector fields.

If $\phi : M \to N$ is a diffeomorphism, then we can define the pull-back of any smooth $(r,s)$
tensor field $\tau \in \Gamma(T^r_s N)$ as
$$\Phi^*(\tau)(p)(\omega_1, \dots, \omega_r, X_1, \dots, X_s) := \tau(\phi(p))\big((\phi^{-1})^*(\omega_1), \dots, (\phi^{-1})^*(\omega_r), \phi_*(X_1), \dots, \phi_*(X_s)\big),$$
with $\omega_i \in T^*_pM$ and $X_i \in T_pM$.


There is, however, an equivalent characterisation of tensor fields as genuine multilinear 
maps. This is, in fact, the standard textbook definition. 


Definition. Let $M$ be a smooth manifold. A smooth $(r,s)$ tensor field $\tau$ on $M$ is a
$C^\infty(M)$-multilinear map
$$\tau : \underbrace{\Gamma(T^*M) \times \cdots \times \Gamma(T^*M)}_{r \text{ copies}} \times \underbrace{\Gamma(TM) \times \cdots \times \Gamma(TM)}_{s \text{ copies}} \to C^\infty(M).$$

The equivalence of this to the bundle definition is due to the pointwise nature of tensors.
For instance, a covector field $\omega \in \Gamma(T^*M)$ can act on a vector field $X \in \Gamma(TM)$ to yield a
smooth function $\omega(X) \in C^\infty(M)$ by
$$(\omega(X))(p) := \omega(p)(X(p)).$$
Then, we see that for any $f \in C^\infty(M)$, we have
$$(\omega(fX))(p) = \omega(p)(f(p)X(p)) = f(p)\,\omega(p)(X(p)) = (f\omega(X))(p)$$
and hence, the map $\omega : \Gamma(TM) \to C^\infty(M)$ is $C^\infty(M)$-linear.

Similarly, the set $\Gamma(T^r_s M)$ of all $(r,s)$ smooth tensor fields on $M$ can be made into a
$C^\infty(M)$-module, with module operations defined pointwise.

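We can also define the tensor product of tensor fields
$$\otimes : \Gamma(T^p_q M) \times \Gamma(T^r_s M) \to \Gamma(T^{p+r}_{q+s} M), \qquad (\tau, \sigma) \mapsto \tau \otimes \sigma$$
analogously to what we had with tensors on a vector space, i.e.
$$(\tau \otimes \sigma)(\omega_1, \dots, \omega_p, \omega_{p+1}, \dots, \omega_{p+r}, X_1, \dots, X_q, X_{q+1}, \dots, X_{q+s}) := \tau(\omega_1, \dots, \omega_p, X_1, \dots, X_q)\, \sigma(\omega_{p+1}, \dots, \omega_{p+r}, X_{q+1}, \dots, X_{q+s}),$$
with $\omega_i \in \Gamma(T^*M)$ and $X_i \in \Gamma(TM)$.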

Therefore, we can think of tensor fields on $M$ either as sections of some tensor bundle
on $M$, that is, as maps assigning to each $p \in M$ a tensor ($\mathbb{R}$-multilinear map) on the vector
space $T_pM$, or as a $C^\infty(M)$-multilinear map as above. We will always try to pick the most
useful or easier to understand, based on the context.

12 Grassmann algebra and de Rham cohomology


12.1 Differential forms 
Definition. Let $M$ be a smooth manifold. A (differential) $n$-form on $M$ is a $(0,n)$ smooth
tensor field $\omega$ which is totally antisymmetric, i.e.
$$\omega(X_1, \dots, X_n) = \mathrm{sgn}(\pi)\, \omega(X_{\pi(1)}, \dots, X_{\pi(n)})$$
for any $\pi \in S_n$, with $X_i \in \Gamma(TM)$.
Alternatively, we can define a differential form as a smooth section of the appropriate
bundle on $M$, i.e. as a map assigning to each $p \in M$ an $n$-form on the vector space $T_pM$.


Example 12.1. a) A manifold $M$ is said to be orientable if it admits an oriented atlas,
i.e. an atlas in which all chart transition maps, which are maps between open subsets
of $\mathbb{R}^{\dim M}$, have a positive determinant.

If $M$ is orientable, then there exists a nowhere vanishing top form ($n = \dim M$) on
$M$ providing the volume.

b) The electromagnetic field strength $F$ is a differential 2-form built from the electric
and magnetic fields, which are also taken to be forms. We will define these later in
some detail.

c) In classical mechanics, if $Q$ is a smooth manifold describing the possible system
configurations, then the phase space is $T^*Q$. There exists a canonically defined 2-form
on $T^*Q$ known as a symplectic form, which we will define later.


If $\omega$ is an $n$-form, then $n$ is said to be the degree of $\omega$. We denote by $\Omega^n(M)$ the
set of all differential $n$-forms on $M$, which then becomes a $C^\infty(M)$-module by defining the
addition and multiplication operations pointwise.
Example 12.2. Of course, we have $\Omega^0(M) = C^\infty(M)$ and $\Omega^1(M) = \Gamma(T^0_1 M) = \Gamma(T^*M)$.

Similarly to the case of forms on vector spaces, we have $\Omega^n(M) = \{0\}$ for $n > \dim M$,
and otherwise $\dim \Omega^n(M) = \binom{\dim M}{n}$, as a $C^\infty(M)$-module.
We can specialise the pull-back of tensors to differential forms. 


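Definition. Let $\phi : M \to N$ be a smooth map and let $\omega \in \Omega^n(N)$. Then we define the
pull-back $\Phi^*(\omega) \in \Omega^n(M)$ of $\omega$ as
$$\Phi^*(\omega) : M \to T^0_n M, \qquad p \mapsto \Phi^*(\omega)(p),$$
where
$$\Phi^*(\omega)(p)(X_1, \dots, X_n) := \omega(\phi(p))\big(\phi_*(X_1), \dots, \phi_*(X_n)\big),$$
for $X_i \in T_pM$.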




The map $\Phi^* : \Omega^n(N) \to \Omega^n(M)$ is $\mathbb{R}$-linear, and its action on $\Omega^0(N)$ is simply
$$\Phi^* : \Omega^0(N) \to \Omega^0(M), \qquad f \mapsto \Phi^*(f) := f \circ \phi.$$
This works for any smooth map $\phi$, and it leads to a slight modification of our mantra:


Vectors are pushed forward, 
forms are pulled back. 


The tensor product $\otimes$ does not interact well with forms, since the tensor product of two
forms is not necessarily a form. Hence, we define the following.


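Definition. Let $M$ be a smooth manifold. We define the wedge (or exterior) product of
forms as the map
$$\wedge : \Omega^n(M) \times \Omega^m(M) \to \Omega^{n+m}(M), \qquad (\omega, \sigma) \mapsto \omega \wedge \sigma,$$
where
$$(\omega \wedge \sigma)(X_1, \dots, X_{n+m}) := \frac{1}{n!\,m!} \sum_{\pi \in S_{n+m}} \mathrm{sgn}(\pi)\, (\omega \otimes \sigma)(X_{\pi(1)}, \dots, X_{\pi(n+m)})$$
and $X_1, \dots, X_{n+m} \in \Gamma(TM)$. By convention, for any $f, g \in \Omega^0(M)$ and $\omega \in \Omega^n(M)$, we set
$$f \wedge g := fg \quad \text{and} \quad f \wedge \omega = \omega \wedge f = f\omega.$$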
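Example 12.3. Suppose that $\omega, \sigma \in \Omega^1(M)$. Then, for any $X, Y \in \Gamma(TM)$
$$\begin{aligned}
(\omega \wedge \sigma)(X, Y) &= (\omega \otimes \sigma)(X, Y) - (\omega \otimes \sigma)(Y, X) \\
&= (\omega \otimes \sigma)(X, Y) - \omega(Y)\sigma(X) \\
&= (\omega \otimes \sigma)(X, Y) - (\sigma \otimes \omega)(X, Y) \\
&= (\omega \otimes \sigma - \sigma \otimes \omega)(X, Y).
\end{aligned}$$
Hence
$$\omega \wedge \sigma = \omega \otimes \sigma - \sigma \otimes \omega.$$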
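The permutation sum in the definition can be evaluated directly at a single point, where forms are just multilinear maps on tuples of vectors. The following is a minimal sketch under that assumption; the names `sign`, `covector` and `wedge`, as well as the sample components, are hypothetical choices made only for illustration.

```python
from fractions import Fraction
from itertools import permutations
from math import factorial


def sign(perm):
    """Sign of a permutation of (0, ..., k-1), computed by counting inversions."""
    inversions = sum(
        1
        for i in range(len(perm))
        for j in range(i + 1, len(perm))
        if perm[i] > perm[j]
    )
    return -1 if inversions % 2 else 1


def covector(components):
    """A 1-form at a point: the linear map sending a vector to a number."""
    return lambda x: sum(c * xi for c, xi in zip(components, x))


def wedge(omega, n, sigma, m):
    """Wedge product of an n-form and an m-form at a point, via the permutation sum."""
    def result(*vectors):
        total = Fraction(0)
        for perm in permutations(range(n + m)):
            args = [vectors[i] for i in perm]
            total += sign(perm) * omega(*args[:n]) * sigma(*args[n:])
        return total / (factorial(n) * factorial(m))
    return result


omega = covector((1, 2, 0))
sigma = covector((0, 1, 3))
X, Y = (1, 0, 2), (0, 1, 1)

lhs = wedge(omega, 1, sigma, 1)(X, Y)
rhs = omega(X) * sigma(Y) - sigma(X) * omega(Y)
print(lhs, rhs)
```

For these sample values both sides evaluate to $-8$, in agreement with $\omega \wedge \sigma = \omega \otimes \sigma - \sigma \otimes \omega$, and exchanging $X$ and $Y$ flips the sign.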


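The wedge product is bilinear over $C^\infty(M)$, that is
$$(f\omega_1 + \omega_2) \wedge \sigma = f\,\omega_1 \wedge \sigma + \omega_2 \wedge \sigma,$$
for all $f \in C^\infty(M)$, $\omega_1, \omega_2 \in \Omega^n(M)$ and $\sigma \in \Omega^m(M)$, and similarly for the second argument.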
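Remark 12.4. If $(U,x)$ is a chart on $M$, then every $n$-form $\omega \in \Omega^n(U)$ can be expressed
locally on $U$ as
$$\omega = \omega_{a_1 \cdots a_n}\, dx^{a_1} \wedge \cdots \wedge dx^{a_n},$$
where $\omega_{a_1 \cdots a_n} \in C^\infty(U)$ and $1 \leq a_1 < \cdots < a_n \leq \dim M$. The $dx^{a_i}$ appearing above are
the covector fields (1-forms)
$$dx^{a_i} : p \mapsto d_p x^{a_i}.$$
The pull-back distributes over the wedge product.

Theorem 12.5. Let $\phi : M \to N$ be smooth, $\omega \in \Omega^n(N)$ and $\sigma \in \Omega^m(N)$. Then, we have
$$\Phi^*(\omega \wedge \sigma) = \Phi^*(\omega) \wedge \Phi^*(\sigma).$$
Proof. Let $p \in M$ and $X_1, \dots, X_{n+m} \in T_pM$. Then we have
$$\begin{aligned}
(\Phi^*(\omega) \wedge \Phi^*(\sigma))(p)(X_1, \dots, X_{n+m})
&= \frac{1}{n!\,m!} \sum_{\pi \in S_{n+m}} \mathrm{sgn}(\pi)\, (\Phi^*(\omega) \otimes \Phi^*(\sigma))(p)(X_{\pi(1)}, \dots, X_{\pi(n+m)}) \\
&= \frac{1}{n!\,m!} \sum_{\pi \in S_{n+m}} \mathrm{sgn}(\pi)\, \Phi^*(\omega)(p)(X_{\pi(1)}, \dots, X_{\pi(n)})\, \Phi^*(\sigma)(p)(X_{\pi(n+1)}, \dots, X_{\pi(n+m)}) \\
&= \frac{1}{n!\,m!} \sum_{\pi \in S_{n+m}} \mathrm{sgn}(\pi)\, \omega(\phi(p))\big(\phi_*(X_{\pi(1)}), \dots, \phi_*(X_{\pi(n)})\big)\, \sigma(\phi(p))\big(\phi_*(X_{\pi(n+1)}), \dots, \phi_*(X_{\pi(n+m)})\big) \\
&= \frac{1}{n!\,m!} \sum_{\pi \in S_{n+m}} \mathrm{sgn}(\pi)\, (\omega \otimes \sigma)(\phi(p))\big(\phi_*(X_{\pi(1)}), \dots, \phi_*(X_{\pi(n+m)})\big) \\
&= (\omega \wedge \sigma)(\phi(p))\big(\phi_*(X_1), \dots, \phi_*(X_{n+m})\big) \\
&= \Phi^*(\omega \wedge \sigma)(p)(X_1, \dots, X_{n+m}).
\end{aligned}$$

Since $p \in M$ was arbitrary, the statement follows.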


12.2 The Grassmann algebra

Note that the wedge product takes two differential forms and produces a differential form
of a different type. It would be much nicer to have a space which is closed under the action
of $\wedge$. In fact, such a space exists and it is called the Grassmann algebra of $M$.


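Definition. Let $M$ be a smooth manifold. Define the $C^\infty(M)$-module
$$\mathrm{Gr}(M) \equiv \Omega(M) := \bigoplus_{n=0}^{\dim M} \Omega^n(M).$$

The Grassmann algebra on $M$ is the algebra $(\Omega(M),+,\cdot,\wedge)$, where
$$\wedge : \Omega(M) \times \Omega(M) \to \Omega(M)$$
is the linear continuation of the previously defined $\wedge : \Omega^n(M) \times \Omega^m(M) \to \Omega^{n+m}(M)$.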


Recall that the direct sum of modules has the Cartesian product of the modules as 
underlying set and module operations defined componentwise. Also, note that by “algebra” 
here we really mean “algebra over a module”. 


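Example 12.6. Let $\psi = \omega + \sigma$, where $\omega \in \Omega^1(M)$ and $\sigma \in \Omega^3(M)$. Of course, this “+” is
neither the addition on $\Omega^1(M)$ nor the one on $\Omega^3(M)$, but rather that on $\Omega(M)$ and, in
fact, $\psi \in \Omega(M)$.
Let $\varphi \in \Omega^n(M)$, for some $n$. Then
$$\varphi \wedge \psi = \varphi \wedge (\omega + \sigma) = \varphi \wedge \omega + \varphi \wedge \sigma,$$
where $\varphi \wedge \omega \in \Omega^{n+1}(M)$, $\varphi \wedge \sigma \in \Omega^{n+3}(M)$, and $\varphi \wedge \psi \in \Omega(M)$.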


Example 12.7. There is a lot of talk about Grassmann numbers, particularly in supersym- 
metry. One often hears that these are “numbers that do not commute, but anticommute”. 
Of course, objects cannot be commutative or anticommutative by themselves. These qual- 
ifiers only apply to operations on the objects. In fact, the Grassmann numbers are just the 
elements of a Grassmann algebra. 


The following result is about the anticommutative behaviour of $\wedge$.
Theorem 12.8. Let $\omega \in \Omega^n(M)$ and $\sigma \in \Omega^m(M)$. Then
$$\omega \wedge \sigma = (-1)^{nm}\, \sigma \wedge \omega.$$


We say that A is graded commutative, that is, it satisfies a version of anticommutativity 
which depends on the degrees of the forms. 


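Proof. First note that if $\omega, \sigma \in \Omega^1(M)$, then
$$\omega \wedge \sigma = \omega \otimes \sigma - \sigma \otimes \omega = -\sigma \wedge \omega.$$
Recall that if $\omega \in \Omega^n(M)$ and $\sigma \in \Omega^m(M)$, then locally on a chart $(U,x)$ we can write
$$\omega = \omega_{a_1 \cdots a_n}\, dx^{a_1} \wedge \cdots \wedge dx^{a_n}$$
$$\sigma = \sigma_{b_1 \cdots b_m}\, dx^{b_1} \wedge \cdots \wedge dx^{b_m}$$
with $1 \leq a_1 < \cdots < a_n \leq \dim M$ and similarly for the $b_i$. The coefficients $\omega_{a_1 \cdots a_n}$ and
$\sigma_{b_1 \cdots b_m}$ are smooth functions in $C^\infty(U)$. Since $dx^{a_i}, dx^{b_j} \in \Omega^1(M)$, we have
$$\begin{aligned}
\omega \wedge \sigma &= \omega_{a_1 \cdots a_n} \sigma_{b_1 \cdots b_m}\, dx^{a_1} \wedge \cdots \wedge dx^{a_n} \wedge dx^{b_1} \wedge \cdots \wedge dx^{b_m} \\
&= (-1)^{n}\, \omega_{a_1 \cdots a_n} \sigma_{b_1 \cdots b_m}\, dx^{b_1} \wedge dx^{a_1} \wedge \cdots \wedge dx^{a_n} \wedge dx^{b_2} \wedge \cdots \wedge dx^{b_m} \\
&= (-1)^{2n}\, \omega_{a_1 \cdots a_n} \sigma_{b_1 \cdots b_m}\, dx^{b_1} \wedge dx^{b_2} \wedge dx^{a_1} \wedge \cdots \wedge dx^{a_n} \wedge dx^{b_3} \wedge \cdots \wedge dx^{b_m} \\
&\;\;\vdots \\
&= (-1)^{nm}\, \omega_{a_1 \cdots a_n} \sigma_{b_1 \cdots b_m}\, dx^{b_1} \wedge \cdots \wedge dx^{b_m} \wedge dx^{a_1} \wedge \cdots \wedge dx^{a_n} \\
&= (-1)^{nm}\, \sigma \wedge \omega
\end{aligned}$$
since we have swapped 1-forms $nm$-many times.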


Remark 12.9. We should stress that this is only true when $\omega$ and $\sigma$ are pure degree forms,
rather than linear combinations of forms of different degrees. Indeed, if $\varphi, \psi \in \Omega(M)$, a
formula like
$$\varphi \wedge \psi = \cdots\, \psi \wedge \varphi$$
does not make sense in principle, because the different parts of $\varphi$ and $\psi$ can have different
commutation behaviours.

12.3 The exterior derivative

Recall the definition of the gradient operator at a point $p \in M$. We can extend that
definition to define the ($\mathbb{R}$-linear) operator:
$$d : C^\infty(M) \to \Gamma(T^*M), \qquad f \mapsto df,$$
where, of course, $df : p \mapsto d_p f$. Alternatively, we can think of $df$ as the $\mathbb{R}$-linear map
$$df : \Gamma(TM) \to C^\infty(M), \qquad X \mapsto df(X) := X(f).$$


Remark 12.10. Locally on some chart $(U,x)$ on $M$, the covector field (or 1-form) $df$ can
be expressed as
$$df = \lambda_a\, dx^a$$
for some smooth functions $\lambda_a \in C^\infty(U)$. To determine what they are, we simply apply both
sides to the vector fields induced by the chart. We have
$$df\left(\frac{\partial}{\partial x^b}\right) = \frac{\partial}{\partial x^b}(f) = \partial_b f$$
and
$$\lambda_a\, dx^a\left(\frac{\partial}{\partial x^b}\right) = \lambda_a \frac{\partial}{\partial x^b}(x^a) = \lambda_a \delta^a_b = \lambda_b.$$

Hence, the local expression of $df$ on $(U,x)$ is
$$df = \partial_a f\, dx^a.$$
Note that the operator $d$ satisfies the Leibniz rule
$$d(fg) = g\, df + f\, dg.$$
We can also understand this as an operator that takes in 0-forms and outputs 1-forms
$$d : \Omega^0(M) \to \Omega^1(M).$$


This can then be extended to an operator which acts on any n-form. We will need the 


following definition. 


Definition. Let $M$ be a smooth manifold and let $X, Y \in \Gamma(TM)$. The commutator (or
Lie bracket) of $X$ and $Y$ is defined as
$$[X,Y] : C^\infty(M) \to C^\infty(M), \qquad f \mapsto [X,Y](f) := X(Y(f)) - Y(X(f)),$$
where we are using the definition of vector fields as $\mathbb{R}$-linear maps $C^\infty(M) \to C^\infty(M)$.

Definition. The exterior derivative on $M$ is the $\mathbb{R}$-linear operator
$$d : \Omega^n(M) \to \Omega^{n+1}(M), \qquad \omega \mapsto d\omega$$
with $d\omega$ being defined as
$$d\omega(X_1, \dots, X_{n+1}) := \sum_{i=1}^{n+1} (-1)^{i+1} X_i\big(\omega(X_1, \dots, \widehat{X_i}, \dots, X_{n+1})\big) + \sum_{i<j} (-1)^{i+j}\, \omega\big([X_i, X_j], X_1, \dots, \widehat{X_i}, \dots, \widehat{X_j}, \dots, X_{n+1}\big),$$
where $X_i \in \Gamma(TM)$ and the hat denotes omissions.


Remark 12.11. Note that the operator d is only well-defined when it acts on forms. In order 
to define a derivative operator on general tensors we will need to add extra structure to our 
differentiable manifold. 


Example 12.12. In the case $n = 1$, the form $d\omega \in \Omega^2(M)$ is given by
$$d\omega(X,Y) = X(\omega(Y)) - Y(\omega(X)) - \omega([X,Y]).$$
Let us check that this is indeed a 2-form, i.e. an antisymmetric, $C^\infty(M)$-multilinear map
$$d\omega : \Gamma(TM) \times \Gamma(TM) \to C^\infty(M).$$
By using the antisymmetry of the Lie bracket, we immediately get
$$d\omega(X,Y) = -d\omega(Y,X).$$

Moreover, thanks to this identity, it suffices to check $C^\infty(M)$-linearity in the first argument
only. Additivity is easily checked
$$\begin{aligned}
d\omega(X_1 + X_2, Y) &= (X_1 + X_2)(\omega(Y)) - Y(\omega(X_1 + X_2)) - \omega([X_1 + X_2, Y]) \\
&= X_1(\omega(Y)) + X_2(\omega(Y)) - Y(\omega(X_1) + \omega(X_2)) - \omega([X_1, Y] + [X_2, Y]) \\
&= X_1(\omega(Y)) + X_2(\omega(Y)) - Y(\omega(X_1)) - Y(\omega(X_2)) - \omega([X_1, Y]) - \omega([X_2, Y]) \\
&= d\omega(X_1, Y) + d\omega(X_2, Y).
\end{aligned}$$


For $C^\infty(M)$-scaling, first we calculate $[fX, Y]$. Let $g \in C^\infty(M)$. Then
$$\begin{aligned}
[fX, Y](g) &= fX(Y(g)) - Y(fX(g)) \\
&= fX(Y(g)) - fY(X(g)) - Y(f)X(g) \\
&= f\big(X(Y(g)) - Y(X(g))\big) - Y(f)X(g) \\
&= \big(f[X,Y] - Y(f)X\big)(g).
\end{aligned}$$
Therefore
$$[fX, Y] = f[X,Y] - Y(f)X.$$
Hence, we can calculate
$$\begin{aligned}
d\omega(fX, Y) &= fX(\omega(Y)) - Y(\omega(fX)) - \omega([fX, Y]) \\
&= fX(\omega(Y)) - Y(f\omega(X)) - \omega\big(f[X,Y] - Y(f)X\big) \\
&= fX(\omega(Y)) - fY(\omega(X)) - Y(f)\omega(X) - f\omega([X,Y]) + \omega(Y(f)X) \\
&= fX(\omega(Y)) - fY(\omega(X)) - Y(f)\omega(X) - f\omega([X,Y]) + Y(f)\omega(X) \\
&= f\, d\omega(X, Y),
\end{aligned}$$
which is what we wanted.


The exterior derivative satisfies a graded version of the Leibniz rule with respect to the 
wedge product. 


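Theorem 12.13. Let $\omega \in \Omega^n(M)$ and $\sigma \in \Omega^m(M)$. Then
$$d(\omega \wedge \sigma) = d\omega \wedge \sigma + (-1)^n\, \omega \wedge d\sigma.$$
Proof. We will work in local coordinates. Let $(U,x)$ be a chart on $M$ and write
$$\omega = \omega_{a_1 \cdots a_n}\, dx^{a_1} \wedge \cdots \wedge dx^{a_n} =: \omega_A\, dx^A$$
$$\sigma = \sigma_{b_1 \cdots b_m}\, dx^{b_1} \wedge \cdots \wedge dx^{b_m} =: \sigma_B\, dx^B.$$
Locally, the exterior derivative operator $d$ acts as
$$d\omega = d\omega_A \wedge dx^A.$$
Hence
$$\begin{aligned}
d(\omega \wedge \sigma) &= d(\omega_A \sigma_B\, dx^A \wedge dx^B) \\
&= d(\omega_A \sigma_B) \wedge dx^A \wedge dx^B \\
&= (\sigma_B\, d\omega_A + \omega_A\, d\sigma_B) \wedge dx^A \wedge dx^B \\
&= \sigma_B\, d\omega_A \wedge dx^A \wedge dx^B + \omega_A\, d\sigma_B \wedge dx^A \wedge dx^B \\
&= \sigma_B\, d\omega_A \wedge dx^A \wedge dx^B + (-1)^n\, \omega_A\, dx^A \wedge d\sigma_B \wedge dx^B \\
&= \sigma_B\, d\omega \wedge dx^B + (-1)^n\, \omega_A\, dx^A \wedge d\sigma \\
&= d\omega \wedge \sigma + (-1)^n\, \omega \wedge d\sigma
\end{aligned}$$
since we have “anticommuted” the 1-form $d\sigma_B$ through the $n$-form $dx^A$, picking up $n$ minus
signs in the process.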
An important property of the exterior derivative is the following. 


Theorem 12.14. Let $\phi : M \to N$ be smooth. For any $\omega \in \Omega^n(N)$, we have
$$\Phi^*(d\omega) = d(\Phi^*(\omega)).$$
Proof (sketch). We first show that this holds for 0-forms (i.e. smooth functions).
Let $f \in C^\infty(N)$, $p \in M$ and $X \in T_pM$. Then
$$\begin{aligned}
\Phi^*(df)(p)(X) &= df(\phi(p))(\phi_*(X)) && \text{(definition of } \Phi^*) \\
&= \phi_*(X)(f) && \text{(definition of } df) \\
&= X(f \circ \phi) && \text{(definition of } \phi_*) \\
&= d(f \circ \phi)(p)(X) && \text{(definition of } d(f \circ \phi)) \\
&= d(\Phi^*(f))(p)(X) && \text{(definition of } \Phi^*),
\end{aligned}$$
so that we have $\Phi^*(df) = d(\Phi^*(f))$.
The general result follows from the linearity of $\Phi^*$ and the fact that the pull-back
distributes over the wedge product.


Remark 12.15. Informally, we can write this result as ®*d = d®*, and say that the exterior 
derivative “commutes” with the pull-back. 

However, you should bear in mind that the two $d$’s appearing in the statement are
two different operators. On the left hand side, it is $d : \Omega^n(N) \to \Omega^{n+1}(N)$, while it is
$d : \Omega^n(M) \to \Omega^{n+1}(M)$ on the right hand side.


Remark 12.16. Of course, we could also combine the operators d into a single operator 
acting on the Grassmann algebra on M 


$$d : \Omega(M) \to \Omega(M)$$


by linear continuation. 


Example 12.17. In the modern formulation of Maxwell’s electrodynamics, the electric and
magnetic fields $E$ and $B$ are taken to be a 1-form and a 2-form on $\mathbb{R}^3$, respectively:
$$E := E_x\, dx + E_y\, dy + E_z\, dz$$
$$B := B_x\, dy \wedge dz + B_y\, dz \wedge dx + B_z\, dx \wedge dy.$$

The electromagnetic field strength $F$ is then defined as the 2-form on $\mathbb{R}^4$
$$F := B + E \wedge dt.$$

In components, we can write
$$F = F_{\mu\nu}\, dx^\mu \wedge dx^\nu,$$
where $(dx^0, dx^1, dx^2, dx^3) \equiv (dt, dx, dy, dz)$ and the components $F_{\mu\nu}$ are built from those
of $E$ and $B$.

The field strength satisfies the equation
$$dF = 0.$$
This is called the homogeneous Maxwell’s equation and it is, in fact, equivalent to the two
homogeneous Maxwell’s (vectorial) equations
$$\nabla \cdot \mathbf{B} = 0$$
$$\nabla \times \mathbf{E} + \frac{\partial \mathbf{B}}{\partial t} = 0.$$
In order to cast the remaining Maxwell’s equations into the language of differential forms,
we need a further operation on forms, called the Hodge star operator.
Recall from the standard theory of electrodynamics that the two equations above imply
the existence of the electric and vector potentials $\varphi$ and $\mathbf{A} = (A_x, A_y, A_z)$, satisfying
$$\mathbf{B} = \nabla \times \mathbf{A}$$
$$\mathbf{E} = -\nabla \varphi - \frac{\partial \mathbf{A}}{\partial t}.$$

Similarly, the equation $dF = 0$ on $\mathbb{R}^4$ implies the existence of an electromagnetic 4-potential
(or gauge potential) form $A \in \Omega^1(\mathbb{R}^4)$ such that
$$F = dA.$$

Indeed, we can take
$$A := -\varphi\, dt + A_x\, dx + A_y\, dy + A_z\, dz.$$

Definition. Let $M$ be a smooth manifold. A 2-form $\omega \in \Omega^2(M)$ is said to be a symplectic
form on $M$ if $d\omega = 0$ and if it is non-degenerate, i.e.
$$(\forall\, Y \in \Gamma(TM) : \omega(X,Y) = 0) \Rightarrow X = 0.$$
A manifold equipped with a symplectic form is called a symplectic manifold.


Example 12.18. In the Hamiltonian formulation of classical mechanics one is especially
interested in the cotangent bundle $T^*Q$ of some configuration space $Q$. Similarly to what
we did when we introduced the tangent bundle, we can define (at least locally) a system of
coordinates on $T^*Q$ by
$$(q^1, \dots, q^{\dim Q}, p_1, \dots, p_{\dim Q}),$$
where the $p_i$’s are the generalised momenta on $Q$ and the $q^i$’s are the generalised coordinates
on $Q$ (recall that $\dim T^*Q = 2 \dim Q$). We can then define a 1-form $\theta \in \Omega^1(T^*Q)$ by
$$\theta := p_i\, dq^i$$
called the symplectic potential. If we further define
$$\omega := d\theta \in \Omega^2(T^*Q),$$
then we can calculate that
$$d\omega = d(d\theta) = \cdots = 0.$$

Moreover, $\omega$ is non-degenerate and hence it is a symplectic form on $T^*Q$.

12.4 de Rham cohomology

The last two examples suggest two possible implications. In the electrodynamics example,
we saw that
$$(dF = 0) \Rightarrow (\exists\, A : F = dA),$$
while in the Hamiltonian mechanics example we saw that
$$(\exists\, \theta : \omega = d\theta) \Rightarrow (d\omega = 0).$$
Definition. Let $M$ be a smooth manifold and let $\omega \in \Omega^n(M)$. We say that $\omega$ is

• closed if $d\omega = 0$;

• exact if $\exists\, \sigma \in \Omega^{n-1}(M) : \omega = d\sigma$.

The question of whether every closed form is exact and vice versa, i.e. whether the
implications
$$(d\omega = 0) \Leftrightarrow (\exists\, \sigma : \omega = d\sigma)$$
hold in general, belongs to the branch of mathematics called cohomology theory, to which
we will now provide an introduction.
The answer for the $\Leftarrow$ direction is affirmative thanks to the following result.

Theorem 12.19. Let $M$ be a smooth manifold. The operator
$$d^2 \equiv d \circ d : \Omega^n(M) \to \Omega^{n+2}(M)$$
is identically zero, i.e. $d^2 = 0$.
For the proof, we will need the following concepts.


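Definition. Given an object which carries some indices, say $T_{a_1 \cdots a_n}$, we define the
antisymmetrization of $T_{a_1 \cdots a_n}$ as
$$T_{[a_1 \cdots a_n]} := \frac{1}{n!} \sum_{\pi \in S_n} \mathrm{sgn}(\pi)\, T_{\pi(a_1) \cdots \pi(a_n)}.$$
Similarly, the symmetrization of $T_{a_1 \cdots a_n}$ is defined as
$$T_{(a_1 \cdots a_n)} := \frac{1}{n!} \sum_{\pi \in S_n} T_{\pi(a_1) \cdots \pi(a_n)}.$$

Some special cases are
$$T_{[ab]} = \tfrac{1}{2}(T_{ab} - T_{ba}), \qquad T_{(ab)} = \tfrac{1}{2}(T_{ab} + T_{ba})$$
$$T_{[abc]} = \tfrac{1}{6}(T_{abc} + T_{bca} + T_{cab} - T_{bac} - T_{cba} - T_{acb})$$
$$T_{(abc)} = \tfrac{1}{6}(T_{abc} + T_{bca} + T_{cab} + T_{bac} + T_{cba} + T_{acb}).$$
Of course, we can (anti)symmetrize only some of the indices
$$T^{ab}{}_{[cd]e} = \tfrac{1}{2}\left(T^{ab}{}_{cde} - T^{ab}{}_{dce}\right).$$
It is easy to check that in a contraction (i.e. a sum), we have
$$T_{a_1 \cdots a_n}\, S^{a_1 \cdots [a_i \cdots a_j] \cdots a_n} = T_{a_1 \cdots [a_i \cdots a_j] \cdots a_n}\, S^{a_1 \cdots a_n}$$
and
$$T_{a_1 \cdots (a_i \cdots a_j) \cdots a_n}\, S^{a_1 \cdots [a_i \cdots a_j] \cdots a_n} = 0.$$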
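Both contraction identities can be checked numerically for small arrays. The following is a minimal sketch, assuming tensors stored as dictionaries from index tuples to numbers and (anti)symmetrization over all indices rather than a subset; the helper names and the sample components are hypothetical.

```python
from fractions import Fraction
from itertools import permutations, product
from math import factorial


def sign(perm):
    """Sign of a permutation of (0, ..., k-1), computed by counting inversions."""
    inversions = sum(
        1
        for i in range(len(perm))
        for j in range(i + 1, len(perm))
        if perm[i] > perm[j]
    )
    return -1 if inversions % 2 else 1


def project(T, rank, dim, antisymmetric):
    """Return T_[a1...an] (antisymmetric=True) or T_(a1...an) as a dictionary."""
    out = {}
    for idx in product(range(dim), repeat=rank):
        total = Fraction(0)
        for perm in permutations(range(rank)):
            weight = sign(perm) if antisymmetric else 1
            total += weight * T[tuple(idx[i] for i in perm)]
        out[idx] = total / factorial(rank)
    return out


def contract(T, S, rank, dim):
    """Full contraction: the sum over all index values of T_{a...} S^{a...}."""
    return sum(T[idx] * S[idx] for idx in product(range(dim), repeat=rank))


dim, rank = 3, 2
T = {(a, b): a + 2 * b + a * b for a in range(dim) for b in range(dim)}
S = {(a, b): (a + 1) ** 2 * (b + 2) for a in range(dim) for b in range(dim)}

T_sym = project(T, rank, dim, antisymmetric=False)
T_anti = project(T, rank, dim, antisymmetric=True)
S_anti = project(S, rank, dim, antisymmetric=True)

print(contract(T_sym, S_anti, rank, dim))
print(contract(T, S_anti, rank, dim) == contract(T_anti, S, rank, dim))
```

The first contraction vanishes because a symmetric object is summed against an antisymmetric one, which is the mechanism behind $d^2 = 0$ in local coordinates.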


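Proof. This can be shown directly using the definition of $d$. Here, we will instead show it
by working in local coordinates.
Recall that, locally on a chart $(U,x)$, we can write any form $\omega \in \Omega^n(M)$ as
$$\omega = \omega_{a_1 \cdots a_n}\, dx^{a_1} \wedge \cdots \wedge dx^{a_n}.$$
Then, we have
$$\begin{aligned}
d\omega &= d\omega_{a_1 \cdots a_n} \wedge dx^{a_1} \wedge \cdots \wedge dx^{a_n} \\
&= \partial_b \omega_{a_1 \cdots a_n}\, dx^b \wedge dx^{a_1} \wedge \cdots \wedge dx^{a_n},
\end{aligned}$$
and hence
$$d^2\omega = \partial_c \partial_b \omega_{a_1 \cdots a_n}\, dx^c \wedge dx^b \wedge dx^{a_1} \wedge \cdots \wedge dx^{a_n}.$$

Since $dx^c \wedge dx^b = -dx^b \wedge dx^c$, we have
$$dx^c \wedge dx^b = dx^{[c} \wedge dx^{b]}.$$
Moreover, by Schwarz’s theorem, we have $\partial_c \partial_b \omega_{a_1 \cdots a_n} = \partial_b \partial_c \omega_{a_1 \cdots a_n}$, and hence
$$\partial_c \partial_b \omega_{a_1 \cdots a_n} = \partial_{(c} \partial_{b)} \omega_{a_1 \cdots a_n}.$$
Thus
$$\begin{aligned}
d^2\omega &= \partial_c \partial_b \omega_{a_1 \cdots a_n}\, dx^c \wedge dx^b \wedge dx^{a_1} \wedge \cdots \wedge dx^{a_n} \\
&= \partial_{(c} \partial_{b)} \omega_{a_1 \cdots a_n}\, dx^{[c} \wedge dx^{b]} \wedge dx^{a_1} \wedge \cdots \wedge dx^{a_n} \\
&= 0.
\end{aligned}$$

Since this holds for any $\omega$, we have $d^2 = 0$.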


Corollary 12.20. Every exact form is closed. 


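We can extend the action of $d$ to the zero vector space $0 := \{0\}$ by mapping the zero
in $0$ to the zero function in $\Omega^0(M)$. In this way, we obtain the chain of $\mathbb{R}$-linear maps
$$0 \xrightarrow{d} \Omega^0(M) \xrightarrow{d} \Omega^1(M) \xrightarrow{d} \cdots \xrightarrow{d} \Omega^n(M) \xrightarrow{d} \Omega^{n+1}(M) \xrightarrow{d} \cdots \xrightarrow{d} \Omega^{\dim M}(M) \xrightarrow{d} 0,$$
where we now think of the spaces $\Omega^n(M)$ as $\mathbb{R}$-vector spaces. Recall from linear algebra
that, given a linear map $\phi : V \to W$, one can define the subspace of $V$
$$\ker(\phi) := \{v \in V \mid \phi(v) = 0\},$$
called the kernel of $\phi$, and the subspace of $W$
$$\mathrm{im}(\phi) := \{\phi(v) \mid v \in V\},$$
called the image of $\phi$.
Going back to our chain of maps, the equation $d^2 = 0$ is equivalent to
$$\mathrm{im}(d : \Omega^n(M) \to \Omega^{n+1}(M)) \subseteq \ker(d : \Omega^{n+1}(M) \to \Omega^{n+2}(M))$$
for all $0 \leq n \leq \dim M - 2$. Moreover, we have
$$\omega \in \Omega^n(M) \text{ is closed} \Leftrightarrow \omega \in \ker(d : \Omega^n(M) \to \Omega^{n+1}(M))$$
$$\omega \in \Omega^n(M) \text{ is exact} \Leftrightarrow \omega \in \mathrm{im}(d : \Omega^{n-1}(M) \to \Omega^n(M)).$$
The traditional notation for the spaces on the right hand side above is
$$Z^n := \ker(d : \Omega^n(M) \to \Omega^{n+1}(M)),$$
$$B^n := \mathrm{im}(d : \Omega^{n-1}(M) \to \Omega^n(M)),$$
so that $Z^n$ is the space of closed $n$-forms and $B^n$ is the space of exact $n$-forms.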

Our original question can be restated as: does $Z^n = B^n$ for all $n$? We have already
seen that $d^2 = 0$ implies that $B^n \subseteq Z^n$ for all $n$ ($B^n$ is, in fact, a vector subspace of $Z^n$).
Unfortunately the equality does not hold in general, but we do have the following result.

Lemma 12.21 (Poincaré). Let $M \subseteq \mathbb{R}^d$ be a contractible (for instance, star-shaped) domain. Then
$$Z^n = B^n, \qquad \forall\, n > 0.$$


In the cases where $Z^n \neq B^n$, we would like to quantify by how much the closed $n$-forms
fail to be exact. The answer is provided by the cohomology group.

Definition. Let $M$ be a smooth manifold. The $n$-th de Rham cohomology group on $M$ is
the quotient $\mathbb{R}$-vector space
$$H^n(M) := Z^n / B^n.$$
You can think of the above quotient as $Z^n/\!\sim$, where $\sim$ is the equivalence relation
$$\omega \sim \sigma \ :\Leftrightarrow\ \omega - \sigma \in B^n.$$

The answer to our question as it is addressed in cohomology theory is: every exact $n$-form
on $M$ is also closed and vice versa if, and only if,
$$H^n(M) \cong_{\mathrm{vec}} 0.$$
Of course, rather than an actual answer, this is yet another restatement of the question.
However, if we are able to determine the spaces $H^n(M)$, then we do get an answer.

A crucial theorem by de Rham states (in more technical terms) that $H^n(M)$ only
depends on the global topology of $M$. In other words, the cohomology groups are topological
invariants. This is remarkable because $H^n(M)$ is defined in terms of exterior derivatives,
which have everything to do with the local differentiable structure of $M$, and a given
topological space can be equipped with several inequivalent differentiable structures.


Example 12.22. Let $M$ be any smooth manifold. We have
$$H^0(M) \cong_{\mathrm{vec}} \mathbb{R}^{(\#\text{ of connected components of } M)}$$
since the closed 0-forms are just the locally constant smooth functions on $M$. As an
immediate consequence, we have
$$H^0(\mathbb{R}) \cong_{\mathrm{vec}} H^0(S^1) \cong_{\mathrm{vec}} \mathbb{R}.$$
Example 12.23. By the Poincaré lemma, we have
$$H^n(M) \cong_{\mathrm{vec}} 0$$
for all $n > 0$ and any contractible $M \subseteq \mathbb{R}^d$.




13 Lie groups and their Lie algebras 
Lie theory is a topic of the utmost importance in both physics and differential geometry. 


13.1 Lie groups 


Definition. A Lie group is a group $(G,\bullet)$, where $G$ is a smooth manifold and the maps
$$\mu : G \times G \to G, \qquad (g_1, g_2) \mapsto g_1 \bullet g_2$$
and
$$i : G \to G, \qquad g \mapsto g^{-1}$$
are both smooth. Note that $G \times G$ inherits a smooth atlas from the smooth atlas of $G$.
Definition. The dimension of a Lie group $(G,\bullet)$ is the dimension of $G$ as a manifold.


Example 13.1. a) Consider $(\mathbb{R}^n,+)$, where $\mathbb{R}^n$ is understood as a smooth $n$-dimensional
manifold. This is a commutative (or abelian) Lie group (since $+$ is commutative),
often called the $n$-dimensional translation group.

b) Let $S^1 := \{z \in \mathbb{C} \mid |z| = 1\}$ and let $\cdot$ be the usual multiplication of complex numbers.
Then $(S^1,\cdot)$ is a commutative Lie group usually denoted $U(1)$.

c) Let $GL(n,\mathbb{R}) := \{\phi : \mathbb{R}^n \to \mathbb{R}^n \text{ linear} \mid \det \phi \neq 0\}$. This set can be endowed with the
structure of a smooth $n^2$-dimensional manifold, by noting that there is a bijection
between linear maps $\phi : \mathbb{R}^n \to \mathbb{R}^n$ and $\mathbb{R}^{n^2}$. The condition $\det \phi \neq 0$ is a so-called
open condition, meaning that $GL(n,\mathbb{R})$ can be identified with an open subset of $\mathbb{R}^{n^2}$,
from which it then inherits a smooth structure.

Then, $(GL(n,\mathbb{R}),\circ)$ is a Lie group called the general linear group.
d) Let $V$ be an $n$-dimensional $\mathbb{R}$-vector space equipped with a pseudo inner product, i.e.
a bilinear map $(-,-)\colon V \times V \to \mathbb{R}$ satisfying
i) symmetry: $\forall\, v,w \in V : (v,w) = (w,v)$;
ii) non-degeneracy: $(\forall\, w \in V : (v,w) = 0) \Rightarrow v = 0$.
Ordinary inner products satisfy a stronger condition than non-degeneracy, called positive
definiteness, which is $\forall\, v \in V : (v,v) \geq 0$ and $(v,v) = 0 \Rightarrow v = 0$.

Given a pseudo inner product $(-,-)$ on $V$, there is always a basis $\{e_a\}$ of $V$ such
that $(e_a,e_b) = \pm 1$ if $a = b$ and zero otherwise. If we get $p$-many $+1$s and $q$-many $-1$s (with
$p + q = n$, of course), then the pair $(p,q)$ is called the signature of the map. Positive
definiteness is the requirement that the signature be $(n,0)$, although in relativity we
require the signature to be $(n-1,1)$. A theorem states that there are (up to isomorphism)
only as many pseudo inner products on $V$ as there are different signatures.

We can define the set
\[ \mathrm{O}(p,q) := \{\phi\colon V \to V \mid \forall\, v,w \in V : (\phi(v),\phi(w)) = (v,w)\}. \]

The pair $(\mathrm{O}(p,q),\circ)$ is a Lie group called the orthogonal group with respect to the
pseudo inner product $(-,-)$. This is, in fact, a Lie subgroup of $\mathrm{GL}(p+q,\mathbb{R})$. Some
notable examples are $\mathrm{O}(3,1)$, which is known as the Lorentz group in relativity, and
$\mathrm{O}(3,0)$, the orthogonal group of 3-dimensional Euclidean space, which consists of
rotations and reflections.
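The defining condition of O(p,q) can be checked numerically on a concrete map. The following is a minimal sketch, assuming a basis in which the pseudo inner product is diagonal with p entries +1 followed by q entries -1; the function names and the choice of a boost along the first axis are illustrative assumptions, not notation from the text. By bilinearity, it suffices to test the condition on basis vectors.

```python
import math

def pseudo_inner(v, w, p=3, q=1):
    """Pseudo inner product of signature (p, q) in a basis with
    (e_a, e_a) = +1 for a < p and -1 otherwise."""
    return sum((1 if a < p else -1) * v[a] * w[a] for a in range(p + q))

def boost(rapidity):
    """4x4 matrix mixing the first spatial axis (index 0) with the
    time axis (index 3); a candidate element of O(3,1)."""
    ch, sh = math.cosh(rapidity), math.sinh(rapidity)
    return [[ch, 0, 0, sh],
            [0, 1, 0, 0],
            [0, 0, 1, 0],
            [sh, 0, 0, ch]]

def apply(m, v):
    """Apply the matrix m to the column vector v."""
    return [sum(m[i][j] * v[j] for j in range(len(v))) for i in range(len(m))]

def preserves_form(m, vectors, tol=1e-9):
    """Check (m v, m w) = (v, w) for all pairs of the given vectors."""
    return all(
        abs(pseudo_inner(apply(m, v), apply(m, w)) - pseudo_inner(v, w)) < tol
        for v in vectors for w in vectors
    )

basis = [[1 if i == j else 0 for j in range(4)] for i in range(4)]
scaling = [[2 if i == j else 0 for j in range(4)] for i in range(4)]

print(preserves_form(boost(0.7), basis))  # the boost preserves the form
print(preserves_form(scaling, basis))     # a uniform scaling does not
```

The scaling map is invertible, so it lies in GL(4,R), but it fails the defining condition and hence is not in O(3,1).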


Definition. Let $(G,\bullet)$ and $(H,\circ)$ be Lie groups. A map $\phi\colon G \to H$ is a Lie group
homomorphism if it is a group homomorphism and a smooth map.


A Lie group isomorphism is a group homomorphism which is also a diffeomorphism. 


13.2 The left translation map 


To every element of a Lie group there is associated a special map. Note that everything we 
will do here can be done equivalently by using right translation maps. 


Definition. Let $(G,\bullet)$ be a Lie group and let $g \in G$. The map
\[ \ell_g\colon G \to G, \qquad h \mapsto \ell_g(h) := g \bullet h \equiv gh \]
is called the left translation by $g$.
If there is no danger of confusion, we usually suppress the $\bullet$ notation.


Proposition 13.2. Let $G$ be a Lie group. For any $g \in G$, the left translation map
$\ell_g\colon G \to G$ is a diffeomorphism.

Proof. Let $h,h' \in G$. Then, we have
\[ \ell_g(h) = \ell_g(h') \;\Leftrightarrow\; gh = gh' \;\Leftrightarrow\; h = h'. \]

Moreover, for any $h \in G$, we have $g^{-1}h \in G$ and
\[ \ell_g(g^{-1}h) = gg^{-1}h = h. \]
Therefore, $\ell_g$ is a bijection on $G$. Note that
\[ \ell_g = \mu(g,-) \]
and since $\mu\colon G \times G \to G$ is smooth by definition, so is $\ell_g$.
The inverse map is $(\ell_g)^{-1} = \ell_{g^{-1}}$, since
\[ \ell_{g^{-1}} \circ \ell_g = \ell_g \circ \ell_{g^{-1}} = \mathrm{id}_G. \]

Then, for the same reason as above with $g$ replaced by $g^{-1}$, the inverse map $(\ell_g)^{-1}$ is also
smooth. Hence, the map $\ell_g$ is indeed a diffeomorphism.
Note that, in general, $\ell_g$ is not an isomorphism of groups, i.e.
\[ \ell_g(hh') \neq \ell_g(h)\,\ell_g(h') \]
in general. However, as the final part of the previous proof suggests, we do have
\[ \ell_g \circ \ell_h = \ell_{gh} \]
for all $g,h \in G$.
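Both statements can be seen on explicit elements of GL(2,R). The sketch below is only an illustration: the matrices `g`, `h`, `k` and the helper names are assumptions chosen for the example, and matrices are stored as nested tuples so that they can be compared exactly.

```python
def matmul(a, b):
    """Product of two 2x2 matrices given as nested tuples."""
    return tuple(
        tuple(sum(a[i][m] * b[m][j] for m in range(2)) for j in range(2))
        for i in range(2)
    )

def left_translation(g):
    """Return the map l_g: h -> g h."""
    return lambda h: matmul(g, h)

# Three invertible integer matrices (determinants 1, 2 and 1).
g = ((1, 1), (0, 1))
h = ((2, 0), (1, 1))
k = ((0, -1), (1, 0))

l_g, l_h, l_gh = left_translation(g), left_translation(h), left_translation(matmul(g, h))

# l_g is not a group homomorphism: l_g(hk) differs from l_g(h) l_g(k).
not_homomorphism = l_g(matmul(h, k)) != matmul(l_g(h), l_g(k))

# But composition of left translations is the left translation by the product.
composition_rule = l_g(l_h(k)) == l_gh(k)

print(not_homomorphism, composition_rule)
```

The second identity is just associativity of the group operation, while the first fails whenever g is not the identity.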


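Since $\ell_g\colon G \to G$ is a diffeomorphism, we have a well-defined push-forward map
\[ (L_g)_*\colon \Gamma(TG) \to \Gamma(TG), \qquad X \mapsto (L_g)_*(X), \]
where
\[ (L_g)_*(X)\colon G \to TG, \qquad h \mapsto (L_g)_*(X)(h) := (\ell_g)_*(X(g^{-1}h)). \]

We can draw the diagram
\[
\begin{array}{ccc}
TG & \xrightarrow{\;(\ell_g)_*\;} & TG \\
{\scriptstyle X}\,\big\uparrow & & \big\uparrow\,{\scriptstyle (L_g)_*(X)} \\
G & \xrightarrow{\;\;\ell_g\;\;} & G
\end{array}
\]

Note that this is exactly the same as our previous definition of the push-forward of a
vector field $\sigma$ under a diffeomorphism $\phi$, namely $\phi_* \circ \sigma \circ \phi^{-1}$.
By introducing the notation $X|_h := X(h)$, so that $X|_h \in T_hG$, we can write
\[ (L_g)_*(X)|_h = (\ell_g)_*(X|_{g^{-1}h}). \]

Alternatively, recalling that the map $\ell_g$ is a diffeomorphism and relabelling the elements of
$G$, we can write this as
\[ (L_g)_*(X)|_{gh} = (\ell_g)_*(X|_h). \]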


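A further reformulation comes from considering the vector field $X \in \Gamma(TG)$ as an $\mathbb{R}$-linear
map $X\colon C^\infty(G) \to C^\infty(G)$. Then, for any $f \in C^\infty(G)$,
\[ (L_g)_*(X)(f) = X(f \circ \ell_g) \circ \ell_{g^{-1}}, \]
as one sees by evaluating the pointwise formula $(L_g)_*(X)|_{gh} = (\ell_g)_*(X|_h)$ on $f$: the
left-hand side gives $\big((L_g)_*(X)(f)\big)(gh)$ and the right-hand side gives $\big(X(f \circ \ell_g)\big)(h)$.

Proposition 13.3. Let $G$ be a Lie group. For any $g,h \in G$, we have
\[ (L_g)_* \circ (L_h)_* = (L_{gh})_*. \]

Proof. Let $f \in C^\infty(G)$. Then, we have
\[
\begin{aligned}
\big((L_g)_* \circ (L_h)_*\big)(X)(f) &= (L_g)_*\big((L_h)_*(X)\big)(f) \\
&= (L_h)_*(X)(f \circ \ell_g) \circ \ell_{g^{-1}} \\
&= X(f \circ \ell_g \circ \ell_h) \circ \ell_{h^{-1}} \circ \ell_{g^{-1}} \\
&= X(f \circ \ell_{gh}) \circ \ell_{(gh)^{-1}} \\
&= (L_{gh})_*(X)(f),
\end{aligned}
\]
as we wanted to show.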


The previous identity applies to the pointwise push-forward as well, i.e.
\[ \big((\ell_{g_1})_* \circ (\ell_{g_2})_*\big)(X|_h) = (\ell_{g_1g_2})_*(X|_h) \]
for any $g_1,g_2,h \in G$ and $X|_h \in T_hG$.


13.3 The Lie algebra of a Lie group


In Lie theory, we are typically not interested in general vector fields, but rather in a special
class of vector fields which are invariant under the induced push-forward of the left
translation maps $\ell_g$.


Definition. Let $G$ be a Lie group. A vector field $X \in \Gamma(TG)$ is said to be left-invariant if
\[ \forall\, g \in G : (L_g)_*(X) = X. \]
Equivalently, we can require this to hold pointwise
\[ \forall\, g,h \in G : (\ell_g)_*(X|_h) = X|_{gh}. \]

By recalling the last reformulation of the push-forward, we have that $X \in \Gamma(TG)$ is
left-invariant if, and only if,
\[ \forall\, f \in C^\infty(G) : \forall\, g \in G : X(f \circ \ell_g) = X(f) \circ \ell_g. \]
We denote the set of all left-invariant vector fields on $G$ as $\mathcal{L}(G)$. Of course,
\[ \mathcal{L}(G) \subseteq \Gamma(TG) \]
but, in fact, more is true. One can check that $\mathcal{L}(G)$ is closed under addition
\[ +\colon \mathcal{L}(G) \times \mathcal{L}(G) \to \mathcal{L}(G), \]
whereas it is closed under the multiplication $\cdot$ by functions in $C^\infty(G)$
only for the constant functions. Therefore, $\mathcal{L}(G)$ is not a $C^\infty(G)$-submodule of
$\Gamma(TG)$, but it is an $\mathbb{R}$-vector subspace of $\Gamma(TG)$.

Recall that, up to now, we have refrained from thinking of $\Gamma(TG)$ as an $\mathbb{R}$-vector space
since it is infinite-dimensional and, even worse, a basis is in general uncountable. A priori,
this could be true for $\mathcal{L}(G)$ as well, but we will see that the situation is, in fact, much nicer
as $\mathcal{L}(G)$ will turn out to be a finite-dimensional vector space over $\mathbb{R}$.
Theorem 13.4. Let $G$ be a Lie group with identity element $e \in G$. Then $\mathcal{L}(G) \cong_{\mathrm{vec}} T_eG$.
Proof. We will construct a linear isomorphism $j\colon T_eG \to \mathcal{L}(G)$. Define
\[ j\colon T_eG \to \Gamma(TG), \qquad A \mapsto j(A), \]
where
\[ j(A)\colon G \to TG, \qquad g \mapsto j(A)|_g := (\ell_g)_*(A). \]

i) First, we show that for any $A \in T_eG$, $j(A)$ is a smooth vector field on $G$. It suffices
to check that for any $f \in C^\infty(G)$, we have $j(A)(f) \in C^\infty(G)$. Indeed
\[
\begin{aligned}
(j(A)(f))(g) &= j(A)|_g(f) \\
&= (\ell_g)_*(A)(f) \\
&= A(f \circ \ell_g) \\
&= (f \circ \ell_g \circ \gamma)'(0),
\end{aligned}
\]
where $\gamma$ is a curve through $e \in G$ whose tangent vector at $e$ is $A$. The map
\[ \varphi\colon \mathbb{R} \times G \to \mathbb{R}, \qquad (t,g) \mapsto \varphi(t,g) := (f \circ \ell_g \circ \gamma)(t) = f(g\gamma(t)) \]
is a composition of smooth maps, hence it is smooth. Then
\[ (j(A)(f))(g) = (\partial_1\varphi)(0,g) \]
depends smoothly on $g$ and thus $j(A)(f) \in C^\infty(G)$.


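ii) Let $g,h \in G$. Then, for every $A \in T_eG$, we have
\[ (\ell_g)_*(j(A)|_h) = (\ell_g)_*\big((\ell_h)_*(A)\big) = (\ell_{gh})_*(A) = j(A)|_{gh}, \]
so $j(A) \in \mathcal{L}(G)$. Hence, the map $j$ is really $j\colon T_eG \to \mathcal{L}(G)$.
iii) Let $A,B \in T_eG$ and $\lambda \in \mathbb{R}$. Then, for any $g \in G$
\[
\begin{aligned}
j(\lambda A + B)|_g &= (\ell_g)_*(\lambda A + B) \\
&= \lambda(\ell_g)_*(A) + (\ell_g)_*(B) \\
&= \lambda\, j(A)|_g + j(B)|_g
\end{aligned}
\]
since the push-forward is an $\mathbb{R}$-linear map. Hence, the map $j\colon T_eG \to \mathcal{L}(G)$ is $\mathbb{R}$-linear.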
iv) Let $A,B \in T_eG$. Then
\[
\begin{aligned}
j(A) = j(B) &\Leftrightarrow \forall\, g \in G : j(A)|_g = j(B)|_g \\
&\Rightarrow j(A)|_e = j(B)|_e \\
&\Leftrightarrow (\ell_e)_*(A) = (\ell_e)_*(B) \\
&\Leftrightarrow A = B,
\end{aligned}
\]
since $(\ell_e)_* = \mathrm{id}_{T_eG}$. Hence, the map $j$ is injective.
v) Let $X \in \mathcal{L}(G)$. Define $A^X := X|_e \in T_eG$. Then, we have
\[ j(A^X)|_g = (\ell_g)_*(A^X) = (\ell_g)_*(X|_e) = X|_{ge} = X|_g, \]
since $X$ is left-invariant. Hence $X = j(A^X)$ and thus $j$ is surjective.

Therefore, $j\colon T_eG \to \mathcal{L}(G)$ is indeed a linear isomorphism.


Corollary 13.5. The space L(G) is finite-dimensional and dim L(G) = dimG. 
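
To make the map $j$ concrete, here is a short worked sketch for the general linear group $\mathrm{GL}(n,\mathbb{R})$ of Example 13.1. It rests on an assumption not spelled out above: since $\mathrm{GL}(n,\mathbb{R})$ is an open subset of $\mathbb{R}^{n^2}$, we identify the tangent space at the identity $e$ with the space of all real $n \times n$ matrices. The curve $\gamma$ is chosen purely for illustration.

```latex
\begin{align*}
\gamma(t) &:= e + tA, \qquad \gamma(0) = e, \quad \gamma'(0) = A
  && \text{(invertible for sufficiently small } |t| \text{)} \\
(\ell_g \circ \gamma)(t) &= g\,(e + tA) = g + t\,gA \\
j(A)|_g = (\ell_g)_*(A) &= (\ell_g \circ \gamma)'(0) = gA
\end{align*}
```

Under this identification, the left-invariant vector field generated by $A$ assigns to each $g$ the matrix $gA$, and in particular $j(A)|_e = A$, in agreement with step v) of the proof.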


We will soon see that the identification of $\mathcal{L}(G)$ and $T_eG$ goes beyond the level of
linear isomorphism, as they are isomorphic as Lie algebras. Recall that a Lie algebra over
an algebraic field $K$ is a vector space over $K$ equipped with a Lie bracket $[-,-]$, i.e. a
$K$-bilinear, antisymmetric map which satisfies the Jacobi identity.

Given $X,Y \in \Gamma(TM)$, we defined their Lie bracket, or commutator, as
\[ [X,Y](f) := X(Y(f)) - Y(X(f)) \]
for any $f \in C^\infty(M)$. You can check that indeed $[X,Y] \in \Gamma(TM)$, and that the bracket is
$\mathbb{R}$-bilinear, antisymmetric and satisfies the Jacobi identity. Thus, $(\Gamma(TM),+,\cdot,[-,-])$ is
an infinite-dimensional Lie algebra over $\mathbb{R}$. We suppress the $+$ and $\cdot$ when they are clear
from the context. In the case of a manifold that is also a Lie group, we have the following.


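Theorem 13.6. Let $G$ be a Lie group. Then $\mathcal{L}(G)$ is a Lie subalgebra of $\Gamma(TG)$.

Proof. A Lie subalgebra of a Lie algebra is simply a vector subspace which is closed under
the action of the Lie bracket. Therefore, we only need to check that
\[ \forall\, X,Y \in \mathcal{L}(G) : [X,Y] \in \mathcal{L}(G). \]

Let $X,Y \in \mathcal{L}(G)$. For any $g \in G$ and $f \in C^\infty(G)$, we have
\[
\begin{aligned}
[X,Y](f \circ \ell_g) &= X(Y(f \circ \ell_g)) - Y(X(f \circ \ell_g)) \\
&= X(Y(f) \circ \ell_g) - Y(X(f) \circ \ell_g) \\
&= X(Y(f)) \circ \ell_g - Y(X(f)) \circ \ell_g \\
&= \big(X(Y(f)) - Y(X(f))\big) \circ \ell_g \\
&= [X,Y](f) \circ \ell_g.
\end{aligned}
\]
Hence, $[X,Y]$ is left-invariant.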
Definition. Let $G$ be a Lie group. The associated Lie algebra of $G$ is $\mathcal{L}(G)$.

Note that $\mathcal{L}(G)$ is a rather complicated object, since its elements are vector fields, hence we
would like to work with $T_eG$ instead, whose elements are tangent vectors. Indeed, we can
use the bracket on $\mathcal{L}(G)$ to define a bracket on $T_eG$ such that they are isomorphic as Lie
algebras. First, let us define the isomorphism of Lie algebras.

Definition. Let $(L_1,[-,-]_{L_1})$ and $(L_2,[-,-]_{L_2})$ be Lie algebras over the same field. A
linear map $\phi\colon L_1 \to L_2$ is a Lie algebra homomorphism if
\[ \forall\, x,y \in L_1 : \phi([x,y]_{L_1}) = [\phi(x),\phi(y)]_{L_2}. \]
If $\phi$ is bijective, then it is a Lie algebra isomorphism and we write $L_1 \cong_{\text{Lie alg}} L_2$.

By using the bracket $[-,-]_{\mathcal{L}(G)}$ on $\mathcal{L}(G)$ we can define, for any $A,B \in T_eG$
\[ [A,B]_{T_eG} := j^{-1}\big([j(A),j(B)]_{\mathcal{L}(G)}\big), \]
where $j^{-1}(X) = X|_e$. Equipped with these brackets, we have
\[ \mathcal{L}(G) \cong_{\text{Lie alg}} T_eG. \]
14 Classification of Lie algebras and Dynkin diagrams


Given a Lie group, we have seen how we can construct a Lie algebra as the space of
left-invariant vector fields or, equivalently, tangent vectors at the identity. We will later explore
the opposite direction, i.e. given a Lie algebra, we will see how to construct a Lie group
whose associated Lie algebra is the one we started from.

However, here we will consider a question that is independent of where a Lie algebra
comes from, namely that of the classification of Lie algebras.


14.1 Lie algebras 


While it is possible to classify Lie algebras more generally, we will only consider the
classification of finite-dimensional complex Lie algebras, i.e. Lie algebras $(L,[-,-])$ where $L$ is
a finite-dimensional $\mathbb{C}$-vector space.

Example 14.1. Of course, any complex Lie group $G$ (where $G$ is a complex manifold) gives
rise to a complex Lie algebra.


If $A,B$ are Lie subalgebras of a Lie algebra $(L,[-,-])$ over $K$, then
\[ [A,B] := \operatorname{span}_K\big(\{[x,y] \in L \mid x \in A \text{ and } y \in B\}\big) \]
is a vector subspace of $L$. It is again a Lie subalgebra (indeed an ideal, as defined below)
whenever $A$ and $B$ are ideals of $L$, which covers all the cases we will need, such as $A = B = L$.
Definition. A Lie algebra $L$ is said to be abelian if
\[ \forall\, x,y \in L : [x,y] = 0. \]
Equivalently, $[L,L] = 0$, where $0$ denotes the trivial Lie algebra $\{0\}$.

Abelian Lie algebras are highly non-interesting as Lie algebras: since the bracket is
identically zero, it may as well not be there. Even from the classification point of view,
the vanishing of the bracket implies that, given any two abelian Lie algebras, every linear
isomorphism between their underlying vector spaces is automatically a Lie algebra
isomorphism. Therefore, for each $n \in \mathbb{N}$, there is (up to isomorphism) only one abelian
$n$-dimensional Lie algebra.
Definition. An ideal $I$ of a Lie algebra $L$ is a Lie subalgebra such that $[I,L] \subseteq I$, i.e.
\[ \forall\, x \in I : \forall\, y \in L : [x,y] \in I. \]
The ideals $0$ and $L$ are called the trivial ideals of $L$.
Definition. A Lie algebra $L$ is said to be
• simple if it is non-abelian and it contains no non-trivial ideals;
• semi-simple if it contains no non-trivial abelian ideals.


Remark 14.2. Note that any simple Lie algebra is also semi-simple. The requirement that 
a simple Lie algebra be non-abelian is due to the 1-dimensional abelian Lie algebra, which 
would otherwise be the only simple Lie algebra which is not semi-simple. 
Definition. Let $L$ be a Lie algebra. The Lie subalgebra
\[ L' := [L,L] \]
is called the derived subalgebra of $L$.

We can form a sequence of Lie subalgebras
\[ L \supseteq L' \supseteq L'' \supseteq \cdots \supseteq L^{(n)} \supseteq \cdots \]
called the derived series of $L$.
Definition. A Lie algebra $L$ is solvable if there exists $k \in \mathbb{N}$ such that $L^{(k)} = 0$.
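
The derived series can be computed mechanically for Lie algebras of matrices, where the bracket is the commutator. The following is a minimal sketch under that assumption, using exact rational arithmetic; the function names and the two test algebras (upper triangular 2x2 matrices and the traceless 2x2 matrices) are illustrative choices, not taken from the text. Each step computes $L^{(k+1)} = [L^{(k)},L^{(k)}]$ as the span of all commutators of basis elements.

```python
from fractions import Fraction
from itertools import product

def commutator(a, b):
    """Matrix commutator [a, b] = ab - ba for square matrices."""
    n = len(a)
    ab = [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]
    ba = [[sum(b[i][k] * a[k][j] for k in range(n)) for j in range(n)] for i in range(n)]
    return [[ab[i][j] - ba[i][j] for j in range(n)] for i in range(n)]

def span_basis(vectors):
    """Gaussian elimination over the rationals: a basis of the span."""
    basis = []  # list of (pivot index, row with a 1 at the pivot)
    for v in vectors:
        v = [Fraction(x) for x in v]
        for pivot, row in basis:
            if v[pivot] != 0:
                factor = v[pivot]
                v = [x - factor * y for x, y in zip(v, row)]
        pivot = next((i for i, x in enumerate(v) if x != 0), None)
        if pivot is not None:
            v = [x / v[pivot] for x in v]
            basis.append((pivot, v))
    return [row for _, row in basis]

def derived(basis):
    """Basis of [L, L] given a non-empty basis of L (as square matrices)."""
    n = len(basis[0])
    flat = [[x for row in commutator(a, b) for x in row]
            for a, b in product(basis, repeat=2)]
    return [[r[i * n:(i + 1) * n] for i in range(n)] for r in span_basis(flat)]

def derived_series_dims(basis):
    """Dimensions of L, L', L'', ... until the series reaches 0 or stabilises."""
    dims = [len(basis)]
    while basis:
        basis = derived(basis)
        dims.append(len(basis))
        if dims[-1] == dims[-2]:
            break
    return dims

upper_triangular = [[[1, 0], [0, 0]], [[0, 1], [0, 0]], [[0, 0], [0, 1]]]
traceless = [[[1, 0], [0, -1]], [[0, 1], [0, 0]], [[0, 0], [1, 0]]]

print(derived_series_dims(upper_triangular))  # reaches 0: solvable
print(derived_series_dims(traceless))         # stabilises at dimension 3: not solvable
```

The series for the upper triangular matrices terminates at zero, so that algebra is solvable, while the traceless matrices equal their own derived subalgebra.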


Recall that the direct sum of vector spaces V @ W has V x W as its underlying set and 
operations defined componentwise. 


Definition. Let $L_1$ and $L_2$ be Lie algebras. The direct sum $L_1 \oplus_{\mathrm{Lie}} L_2$ has $L_1 \oplus L_2$ as its
underlying vector space and Lie bracket defined as
\[ [x_1 + x_2, y_1 + y_2]_{L_1 \oplus_{\mathrm{Lie}} L_2} := [x_1,y_1]_{L_1} + [x_2,y_2]_{L_2} \]
for all $x_1,y_1 \in L_1$ and $x_2,y_2 \in L_2$. Alternatively, by identifying $L_1$ and $L_2$ with the
subspaces $L_1 \oplus 0$ and $0 \oplus L_2$ of $L_1 \oplus L_2$ respectively, we require
\[ [L_1,L_2]_{L_1 \oplus_{\mathrm{Lie}} L_2} = 0. \]

In the following, we will drop the "Lie" subscript and understand $\oplus$ to mean $\oplus_{\mathrm{Lie}}$ whenever
the summands are Lie algebras.


There is a weaker notion than the direct sum, defined only for Lie algebras. 


Definition. Let $R$ and $L$ be Lie algebras. The semi-direct sum $R \oplus_s L$ has $R \oplus L$ as its
underlying vector space and Lie bracket satisfying
\[ [R,L]_{R \oplus_s L} \subseteq R, \]
i.e. $R$ is an ideal of $R \oplus_s L$.
We are now ready to state Levi's decomposition theorem.

Theorem 14.3 (Levi). Any finite-dimensional complex Lie algebra $L$ can be decomposed
as
\[ L = R \oplus_s (L_1 \oplus \cdots \oplus L_n) \]
where $R$ is a solvable Lie algebra and $L_1,\ldots,L_n$ are simple Lie algebras.


As of today, no general classification of solvable Lie algebras is known, except for some 
special cases (e.g. in low dimensions). In contrast, the finite dimensional, simple, complex 
Lie algebras have been classified completely. 


Proposition 14.4. A Lie algebra is semi-simple if, and only if, it can be expressed as a 
direct sum of simple Lie algebras. 


Hence, the simple Lie algebras are the basic building blocks from which one can build 
any semi-simple Lie algebra. Then, by Levi’s theorem, the classification of simple Lie 
algebras easily extends to a classification of all semi-simple Lie algebras. 
14.2 The adjoint map and the Killing form


Definition. Let $L$ be a Lie algebra over $K$ and let $x \in L$. The adjoint map with respect to
$x$ is the $K$-linear map
\[ \mathrm{ad}_x\colon L \to L, \qquad y \mapsto \mathrm{ad}_x(y) := [x,y]. \]

The linearity of $\mathrm{ad}_x$ follows from the linearity of the bracket in the second argument,
while the linearity in the first argument of the bracket implies that the map
\[ \mathrm{ad}\colon L \to \mathrm{End}(L), \qquad x \mapsto \mathrm{ad}(x) := \mathrm{ad}_x \]
itself is also linear. In fact, more is true. Recall that $\mathrm{End}(L)$ is a Lie algebra with bracket
\[ [\phi,\psi] := \phi \circ \psi - \psi \circ \phi. \]
Then, we have the following.
Proposition 14.5. The map $\mathrm{ad}\colon L \to \mathrm{End}(L)$ is a Lie algebra homomorphism.

Proof. It remains to check that $\mathrm{ad}$ preserves the brackets. Let $x,y,z \in L$. Then
\[
\begin{aligned}
\mathrm{ad}_{[x,y]}(z) &:= [[x,y],z] && \text{(definition of ad)} \\
&= -[[y,z],x] - [[z,x],y] && \text{(Jacobi's identity)} \\
&= [x,[y,z]] - [y,[x,z]] && \text{(anti-symmetry)} \\
&= \mathrm{ad}_x(\mathrm{ad}_y(z)) - \mathrm{ad}_y(\mathrm{ad}_x(z)) \\
&= (\mathrm{ad}_x \circ \mathrm{ad}_y - \mathrm{ad}_y \circ \mathrm{ad}_x)(z) \\
&= [\mathrm{ad}_x,\mathrm{ad}_y](z).
\end{aligned}
\]
Hence, we have $\mathrm{ad}([x,y]) = [\mathrm{ad}(x),\mathrm{ad}(y)]$.


Definition. Let $L$ be a finite-dimensional Lie algebra over $K$. The Killing form on $L$ is the $K$-bilinear map
\[ \kappa\colon L \times L \to K, \qquad (x,y) \mapsto \kappa(x,y) := \mathrm{tr}(\mathrm{ad}_x \circ \mathrm{ad}_y), \]
where $\mathrm{tr}$ is the usual trace on the vector space $\mathrm{End}(L)$.

Note that the Killing form is not a "form" in the sense that we defined previously. In
fact, since $L$ is finite-dimensional, the trace is cyclic and thus $\kappa$ is symmetric, i.e.
\[ \forall\, x,y \in L : \kappa(x,y) = \kappa(y,x). \]


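An important property of $\kappa$ is its associativity with respect to the bracket.
Proposition 14.6. Let $L$ be a Lie algebra. For any $x,y,z \in L$, we have
\[ \kappa([x,y],z) = \kappa(x,[y,z]). \]

Proof. This follows easily from the fact that $\mathrm{ad}$ is a homomorphism.
\[
\begin{aligned}
\kappa([x,y],z) &:= \mathrm{tr}(\mathrm{ad}_{[x,y]} \circ \mathrm{ad}_z) \\
&= \mathrm{tr}([\mathrm{ad}_x,\mathrm{ad}_y] \circ \mathrm{ad}_z) \\
&= \mathrm{tr}(\mathrm{ad}_x \circ \mathrm{ad}_y \circ \mathrm{ad}_z) - \mathrm{tr}(\mathrm{ad}_y \circ \mathrm{ad}_x \circ \mathrm{ad}_z) \\
&= \mathrm{tr}(\mathrm{ad}_x \circ \mathrm{ad}_y \circ \mathrm{ad}_z) - \mathrm{tr}(\mathrm{ad}_x \circ \mathrm{ad}_z \circ \mathrm{ad}_y) \\
&= \mathrm{tr}(\mathrm{ad}_x \circ [\mathrm{ad}_y,\mathrm{ad}_z]) \\
&= \mathrm{tr}(\mathrm{ad}_x \circ \mathrm{ad}_{[y,z]}) \\
&=: \kappa(x,[y,z]),
\end{aligned}
\]
where we used the cyclicity of the trace.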


We can use $\kappa$ to give a further equivalent characterisation of semi-simplicity.

Proposition 14.7 (Cartan's criterion). A Lie algebra $L$ is semi-simple if, and only if, the
Killing form $\kappa$ is non-degenerate, i.e.
\[ (\forall\, y \in L : \kappa(x,y) = 0) \Rightarrow x = 0. \]

Hence, if $L$ is semi-simple, then $\kappa$ is a pseudo inner product on $L$. Recall the following
definition from linear algebra.

Definition. A linear map $\phi\colon V \to V$ is said to be symmetric with respect to the pseudo
inner product $B(-,-)$ on $V$ if
\[ \forall\, v,w \in V : B(\phi(v),w) = B(v,\phi(w)). \]
If, instead, we have
\[ \forall\, v,w \in V : B(\phi(v),w) = -B(v,\phi(w)), \]
then $\phi$ is said to be anti-symmetric with respect to $B$.

The associativity property of $\kappa$ with respect to the bracket can be restated by saying
that, for any $z \in L$, the linear map $\mathrm{ad}_z$ is anti-symmetric with respect to $\kappa$, i.e.
\[ \forall\, x,y \in L : \kappa(\mathrm{ad}_z(x),y) = -\kappa(x,\mathrm{ad}_z(y)). \]


In order to do computations, it is useful to introduce a basis $\{E_i\}$ on $L$.

Definition. Let $L$ be a Lie algebra over $K$ and let $\{E_i\}$ be a basis. Then, we have
\[ [E_i,E_j] = C^k{}_{ij}E_k \]
for some $C^k{}_{ij} \in K$. The numbers $C^k{}_{ij}$ are called the structure constants of $L$ with respect
to the basis $\{E_i\}$.

In terms of the structure constants, the anti-symmetry of the Lie bracket reads
\[ C^k{}_{ij} = -C^k{}_{ji}, \]
while the Jacobi identity becomes
\[ C^n{}_{im}C^m{}_{jk} + C^n{}_{jm}C^m{}_{ki} + C^n{}_{km}C^m{}_{ij} = 0. \]


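We can now express both the adjoint maps and the Killing form in terms of components
with respect to a basis.

Proposition 14.8. Let $L$ be a Lie algebra and let $\{E_i\}$ be a basis. Then
i) $(\mathrm{ad}_{E_i})^k{}_j = C^k{}_{ij}$;
ii) $\kappa_{ij} = C^m{}_{ik}C^k{}_{jm}$,
where $C^k{}_{ij}$ are the structure constants of $L$ with respect to $\{E_i\}$.

Proof. i) Denote by $\{\varepsilon^i\}$ the dual basis to $\{E_i\}$. Then, we have
\[
\begin{aligned}
(\mathrm{ad}_{E_i})^k{}_j &:= \varepsilon^k(\mathrm{ad}_{E_i}(E_j)) \\
&= \varepsilon^k([E_i,E_j]) \\
&= \varepsilon^k(C^m{}_{ij}E_m) \\
&= C^m{}_{ij}\,\varepsilon^k(E_m) \\
&= C^k{}_{ij},
\end{aligned}
\]
since $\varepsilon^k(E_m) = \delta^k_m$.

ii) Recall from linear algebra that if $V$ is finite-dimensional, for any $\phi \in \mathrm{End}(V)$ we have
$\mathrm{tr}(\phi) = \Phi^k{}_k$, where $\Phi$ is the matrix representing the linear map in any basis. Also,
recall that the matrix representing $\phi \circ \psi$ is the product $\Phi\Psi$. Using these, we have
\[
\begin{aligned}
\kappa_{ij} &:= \kappa(E_i,E_j) \\
&= \mathrm{tr}(\mathrm{ad}_{E_i} \circ \mathrm{ad}_{E_j}) \\
&= (\mathrm{ad}_{E_i} \circ \mathrm{ad}_{E_j})^k{}_k \\
&= (\mathrm{ad}_{E_i})^k{}_m(\mathrm{ad}_{E_j})^m{}_k \\
&= C^k{}_{im}C^m{}_{jk} \\
&= C^m{}_{ik}C^k{}_{jm},
\end{aligned}
\]
where we used the same notation for the linear maps and their matrices, and relabelled the
summation indices in the last step.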
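
The component formulas lend themselves to direct computation. The sketch below evaluates them for a 3-dimensional Lie algebra with basis $(h,e,f)$ and brackets $[h,e] = 2e$, $[h,f] = -2f$, $[e,f] = h$, which are the standard relations of sl(2); this choice of example, the basis ordering and all function names are assumptions made for illustration. The array `C[k][i][j]` stores $C^k{}_{ij}$.

```python
from itertools import product

DIM = 3  # basis order: 0 -> h, 1 -> e, 2 -> f

def structure_constants():
    """Return C with C[k][i][j] = C^k_{ij}, i.e. [E_i, E_j] = C^k_{ij} E_k."""
    C = [[[0] * DIM for _ in range(DIM)] for _ in range(DIM)]

    def set_bracket(i, j, k, value):
        C[k][i][j] = value
        C[k][j][i] = -value  # anti-symmetry of the bracket

    set_bracket(0, 1, 1, 2)    # [h, e] = 2e
    set_bracket(0, 2, 2, -2)   # [h, f] = -2f
    set_bracket(1, 2, 0, 1)    # [e, f] = h
    return C

def jacobi_holds(C):
    """Check C^n_{im} C^m_{jk} + C^n_{jm} C^m_{ki} + C^n_{km} C^m_{ij} = 0."""
    rng = range(DIM)
    return all(
        sum(C[n][i][m] * C[m][j][k] + C[n][j][m] * C[m][k][i] + C[n][k][m] * C[m][i][j]
            for m in rng) == 0
        for i, j, k, n in product(rng, repeat=4)
    )

def killing_form(C):
    """Components kappa_{ij} = C^m_{ik} C^k_{jm}."""
    rng = range(DIM)
    return [[sum(C[m][i][k] * C[k][j][m] for m in rng for k in rng) for j in rng]
            for i in rng]

def det3(m):
    """Determinant of a 3x3 matrix by cofactor expansion along the first row."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

C = structure_constants()
kappa = killing_form(C)
print(jacobi_holds(C))
print(kappa)
print(det3(kappa) != 0)  # non-degenerate, consistent with Cartan's criterion
```

The resulting matrix is symmetric with non-zero determinant, so by Cartan's criterion this algebra is semi-simple.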
14.3 The fundamental roots and the Weyl group


We will now focus on finite-dimensional semi-simple complex Lie algebras, whose
classification hinges on the existence of a special type of subalgebra.

Definition. Let $L$ be a $d$-dimensional Lie algebra. A Cartan subalgebra $H$ of $L$ is a
maximal Lie subalgebra of $L$ with the following property: there exists a basis $\{h_1,\ldots,h_r\}$
of $H$ which can be extended to a basis $\{h_1,\ldots,h_r,e_1,\ldots,e_{d-r}\}$ of $L$ such that $e_1,\ldots,e_{d-r}$
are eigenvectors of $\mathrm{ad}(h)$ for any $h \in H$, i.e.
\[ \forall\, h \in H : \exists\, \lambda_\alpha(h) \in \mathbb{C} : \mathrm{ad}(h)e_\alpha = \lambda_\alpha(h)e_\alpha, \]
for each $1 \leq \alpha \leq d-r$.
The basis $\{h_1,\ldots,h_r,e_1,\ldots,e_{d-r}\}$ is known as a Cartan-Weyl basis of $L$.
Theorem 14.9. Let $L$ be a finite-dimensional semi-simple complex Lie algebra. Then
i) $L$ possesses a Cartan subalgebra;
ii) all Cartan subalgebras of $L$ have the same dimension, called the rank of $L$;
iii) any Cartan subalgebra of $L$ is abelian.


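Note that we can think of the $\lambda_\alpha$ appearing above as a map $\lambda_\alpha\colon H \to \mathbb{C}$. Moreover,
for any $z \in \mathbb{C}$ and $h,h' \in H$, we have
\[
\begin{aligned}
\lambda_\alpha(zh + h')e_\alpha &= \mathrm{ad}(zh + h')e_\alpha \\
&= [zh + h',e_\alpha] \\
&= z[h,e_\alpha] + [h',e_\alpha] \\
&= z\lambda_\alpha(h)e_\alpha + \lambda_\alpha(h')e_\alpha \\
&= \big(z\lambda_\alpha(h) + \lambda_\alpha(h')\big)e_\alpha.
\end{aligned}
\]
Hence $\lambda_\alpha$ is a $\mathbb{C}$-linear map $\lambda_\alpha\colon H \to \mathbb{C}$, and thus $\lambda_\alpha \in H^*$.

Definition. The maps $\lambda_1,\ldots,\lambda_{d-r} \in H^*$ are called the roots of $L$. The collection
\[ \Phi := \{\lambda_\alpha \mid 1 \leq \alpha \leq d-r\} \subseteq H^* \]
is called the root set of $L$.

One can show that if $\lambda_\alpha$ were the zero map, then we would have $e_\alpha \in H$. Thus, we
must have $0 \notin \Phi$. Note that a consequence of the anti-symmetry of each $\mathrm{ad}(h)$ with respect
to the Killing form $\kappa$ is that
\[ \lambda \in \Phi \Rightarrow -\lambda \in \Phi. \]

Hence $\Phi$ is not a linearly independent subset of $H^*$.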


Definition. A set of fundamental roots $\Pi := \{\pi_1,\ldots,\pi_f\}$ is a subset $\Pi \subseteq \Phi$ such that
a) $\Pi$ is a linearly independent subset of $H^*$;
b) for each $\lambda \in \Phi$, there exist $n_1,\ldots,n_f \in \mathbb{N}$ and $\epsilon \in \{+1,-1\}$ such that
\[ \lambda = \epsilon\sum_{i=1}^{f} n_i\pi_i. \]

We can write the last equation more concisely as $\lambda \in \operatorname{span}_{\epsilon,\mathbb{N}}(\Pi)$. Observe that, for
any $\lambda \in \Phi$, the coefficients of $\pi_1,\ldots,\pi_f$ in the expansion above always have the same sign.
Indeed, we have $\operatorname{span}_{\epsilon,\mathbb{N}}(\Pi) \neq \operatorname{span}_{\mathbb{Z}}(\Pi)$.

Theorem 14.10. Let $L$ be a finite-dimensional semi-simple complex Lie algebra. Then
i) a set $\Pi \subseteq \Phi$ of fundamental roots always exists;
ii) we have $\operatorname{span}_{\mathbb{C}}(\Pi) = H^*$, that is, $\Pi$ is a basis of $H^*$.

Corollary 14.11. We have $|\Pi| = r$, where $r$ is the rank of $L$.

Proof. Since $\Pi$ is a basis, $|\Pi| = \dim H^* = \dim H = r$.


We would now like to use $\kappa$ to define a pseudo inner product on $H^*$. We know from
linear algebra that a pseudo inner product $B(-,-)$ on a finite-dimensional vector space $V$
over $K$ induces a linear isomorphism
\[ i\colon V \to V^*, \qquad v \mapsto i(v) := B(v,-) \]
which can be used to define a pseudo inner product $B^*(-,-)$ on $V^*$ as
\[ B^*\colon V^* \times V^* \to K, \qquad (\phi,\psi) \mapsto B^*(\phi,\psi) := B(i^{-1}(\phi),i^{-1}(\psi)). \]

We would like to apply this to the restriction of $\kappa$ to the Cartan subalgebra. However,
a pseudo inner product on a vector space is not necessarily a pseudo inner product on a
subspace, since the non-degeneracy condition may fail when considered on a subspace.


Proposition 14.12. The restriction of $\kappa$ to $H$ is a pseudo inner product on $H$.

Proof. Bilinearity and symmetry are automatically satisfied. It remains to show that $\kappa$ is
non-degenerate on $H$.

i) Let $\{h_1,\ldots,h_r,e_{r+1},\ldots,e_d\}$ be a Cartan-Weyl basis of $L$ and let $\lambda_\alpha \in \Phi$. Then
\[
\begin{aligned}
\lambda_\alpha(h_j)\,\kappa(h_i,e_\alpha) &= \kappa(h_i,\lambda_\alpha(h_j)e_\alpha) \\
&= \kappa(h_i,[h_j,e_\alpha]) \\
&= \kappa([h_i,h_j],e_\alpha) \\
&= \kappa(0,e_\alpha) \\
&= 0.
\end{aligned}
\]
Since $\lambda_\alpha \neq 0$, there is some $h_j$ such that $\lambda_\alpha(h_j) \neq 0$ and hence
\[ \kappa(h_i,e_\alpha) = 0. \]
By linearity, we have $\kappa(h,e_\alpha) = 0$ for any $h \in H$ and any $e_\alpha$.
ii) Let $h \in H \subseteq L$. Since $\kappa$ is non-degenerate on $L$, we have
\[ (\forall\, x \in L : \kappa(h,x) = 0) \Rightarrow h = 0. \]
Expand $x \in L$ in the Cartan-Weyl basis as
\[ x = h' + e \]
where $h' := x^ih_i$ and $e := x^\alpha e_\alpha$. Then, we have
\[ \kappa(h,x) = \kappa(h,h') + x^\alpha\kappa(h,e_\alpha) = \kappa(h,h'). \]
Thus, the non-degeneracy condition reads
\[ (\forall\, h' \in H : \kappa(h,h') = 0) \Rightarrow h = 0, \]
which is what we wanted.


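We can now define
\[ \kappa^*\colon H^* \times H^* \to \mathbb{C}, \qquad (\mu,\nu) \mapsto \kappa^*(\mu,\nu) := \kappa(i^{-1}(\mu),i^{-1}(\nu)), \]
where $i\colon H \to H^*$ is the linear isomorphism induced by $\kappa$.

Remark 14.13. If $\{h_i\}$ is a basis of $H$, the components of $\kappa^*$ with respect to the dual basis
satisfy
\[ (\kappa^*)^{ij}\kappa_{jk} = \delta^i_k. \]
Hence, we can write
\[ \kappa^*(\mu,\nu) = (\kappa^*)^{ij}\mu_i\nu_j, \]
where $\mu_i := \mu(h_i)$.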


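We now turn our attention to the real subspace $H^*_{\mathbb{R}} := \operatorname{span}_{\mathbb{R}}(\Pi)$. Note that we have
the following chain of inclusions
\[ \Pi \subseteq \Phi \subseteq \operatorname{span}_{\epsilon,\mathbb{N}}(\Pi) \subseteq \underbrace{\operatorname{span}_{\mathbb{R}}(\Pi)}_{H^*_{\mathbb{R}}} \subseteq \underbrace{\operatorname{span}_{\mathbb{C}}(\Pi)}_{H^*}. \]
The restriction of $\kappa^*$ to $H^*_{\mathbb{R}}$ leads to a surprising result.
Theorem 14.14. i) For any $\alpha,\beta \in H^*_{\mathbb{R}}$, we have $\kappa^*(\alpha,\beta) \in \mathbb{R}$.

ii) $\kappa^*\colon H^*_{\mathbb{R}} \times H^*_{\mathbb{R}} \to \mathbb{R}$ is an inner product on $H^*_{\mathbb{R}}$.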
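This is indeed a surprise! Upon restriction to $H^*_{\mathbb{R}}$, instead of being weakened, the
non-degeneracy of $\kappa^*$ gets strengthened to positive definiteness. Now that we have a proper
real inner product, we can define some familiar notions from basic linear algebra, such as
lengths and angles.

Definition. Let $\alpha,\beta \in H^*_{\mathbb{R}}$. Then, we define
i) the length of $\alpha$ as $|\alpha| := \sqrt{\kappa^*(\alpha,\alpha)}$;
ii) the angle between $\alpha$ and $\beta$ as $\varphi := \cos^{-1}\!\left(\dfrac{\kappa^*(\alpha,\beta)}{|\alpha||\beta|}\right)$.

We need one final ingredient for our classification result.

Definition. For any $\lambda \in \Phi \subseteq H^*_{\mathbb{R}}$, define the linear map
\[ s_\lambda\colon H^*_{\mathbb{R}} \to H^*_{\mathbb{R}}, \qquad \mu \mapsto s_\lambda(\mu), \]
where
\[ s_\lambda(\mu) := \mu - 2\,\frac{\kappa^*(\lambda,\mu)}{\kappa^*(\lambda,\lambda)}\,\lambda. \]

The map $s_\lambda$ is called a Weyl transformation, and the group $W$ generated by the set
\[ \{s_\lambda \mid \lambda \in \Phi\} \]
under composition of maps is called the Weyl group.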


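Theorem 14.15. i) The Weyl group $W$ is generated by the fundamental roots in $\Pi$, in
the sense that
\[ \forall\, w \in W : \exists\, n \in \mathbb{N} : \exists\, \pi_{i_1},\ldots,\pi_{i_n} \in \Pi : w = s_{\pi_{i_1}} \circ s_{\pi_{i_2}} \circ \cdots \circ s_{\pi_{i_n}}; \]

ii) Every root can be produced from a fundamental root by the action of $W$, i.e.
\[ \forall\, \lambda \in \Phi : \exists\, \pi \in \Pi : \exists\, w \in W : \lambda = w(\pi); \]
iii) The Weyl group permutes the roots, that is,
\[ \forall\, \lambda \in \Phi : \forall\, w \in W : w(\lambda) \in \Phi. \]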
14.4 Dynkin diagrams and the Cartan classification 


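Consider, for any $\pi_i,\pi_j \in \Pi$, the action of the Weyl transformation
\[ s_{\pi_i}(\pi_j) := \pi_j - 2\,\frac{\kappa^*(\pi_i,\pi_j)}{\kappa^*(\pi_i,\pi_i)}\,\pi_i. \]

Since $s_{\pi_i}(\pi_j) \in \Phi$ and $\Phi \subseteq \operatorname{span}_{\epsilon,\mathbb{N}}(\Pi)$, for all $1 \leq i \neq j \leq r$ we must have
\[ -2\,\frac{\kappa^*(\pi_i,\pi_j)}{\kappa^*(\pi_i,\pi_i)} \in \mathbb{N}. \]

Definition. The Cartan matrix of a Lie algebra is the $r \times r$ matrix $C$ with entries
\[ C_{ij} := 2\,\frac{\kappa^*(\pi_i,\pi_j)}{\kappa^*(\pi_i,\pi_i)}, \]
where the $C_{ij}$ should not be confused with the structure constants $C^k{}_{ij}$.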


Theorem 14.16. To every simple finite-dimensional complex Lie algebra there corresponds 
a unique Cartan matrix and vice versa (up to relabelling of the basis elements). 


Of course, not every matrix can be a Cartan matrix. For instance, since $C_{ii} = 2$ (no
summation implied), the diagonal entries of $C$ are all equal to 2, while the off-diagonal
entries are either zero or negative. In general, $C_{ij} \neq C_{ji}$, so the Cartan matrix is not
symmetric, but if $C_{ij} = 0$, then necessarily $C_{ji} = 0$.

We have thus reduced the problem of classifying the simple finite-dimensional complex
Lie algebras to that of finding all the Cartan matrices. This can, in turn, be reduced to the
problem of determining all the inequivalent Dynkin diagrams.
Definition. Given a Cartan matrix $C$, the $ij$-th bond number is
\[ n_{ij} := C_{ij}C_{ji} \qquad \text{(no summation implied)}. \]
Note that we have
\[
\begin{aligned}
n_{ij} &= 4\,\frac{\kappa^*(\pi_i,\pi_j)}{\kappa^*(\pi_i,\pi_i)}\,\frac{\kappa^*(\pi_j,\pi_i)}{\kappa^*(\pi_j,\pi_j)} \\
&= 4\left(\frac{\kappa^*(\pi_i,\pi_j)}{|\pi_i||\pi_j|}\right)^2 \\
&= 4\cos^2\varphi,
\end{aligned}
\]
where $\varphi$ is the angle between $\pi_i$ and $\pi_j$. For $i \neq j$, the angle $\varphi$ is neither zero nor 180°,
hence $0 \leq \cos^2\varphi < 1$, and therefore
\[ n_{ij} \in \{0,1,2,3\}. \]

Since $C_{ij} \leq 0$ for $i \neq j$, the only possibilities are

 C_ij   C_ji | n_ij
 ------------+-----
   0      0  |  0
  -1     -1  |  1
  -1     -2  |  2
  -1     -3  |  3

Note that while the Cartan matrices are not symmetric, swapping any pair of $C_{ij}$ and
$C_{ji}$ gives a Cartan matrix which represents the same Lie algebra as the original matrix,
with two elements from the Cartan-Weyl basis swapped. This is why we have not included
$(-2,-1)$ and $(-3,-1)$ in the table above.
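
As an illustration of the Cartan matrix and bond numbers, the following sketch computes them from explicit fundamental roots. It relies on an assumption: the roots are realised as vectors in Euclidean space and the ordinary dot product is used in place of the inner product induced by the Killing form (a positive overall rescaling cancels in each entry). The particular root vectors, commonly used for the diagrams with bond numbers 1, 2 and 3, and the function names are illustrative choices.

```python
from fractions import Fraction

def dot(u, v):
    """Euclidean dot product with exact rational arithmetic."""
    return sum(Fraction(a) * Fraction(b) for a, b in zip(u, v))

def cartan_matrix(simple_roots):
    """Entries C_ij = 2 <pi_i, pi_j> / <pi_i, pi_i>."""
    return [[2 * dot(pi, pj) / dot(pi, pi) for pj in simple_roots]
            for pi in simple_roots]

def bond_numbers(C):
    """Entries n_ij = C_ij * C_ji (no summation)."""
    r = len(C)
    return [[C[i][j] * C[j][i] for j in range(r)] for i in range(r)]

# Assumed Euclidean realisations of rank-2 fundamental root systems.
roots_single_bond = [(1, -1, 0), (0, 1, -1)]   # equal lengths, angle 120 degrees
roots_double_bond = [(1, -1), (0, 1)]          # length ratio sqrt(2), angle 135 degrees
roots_triple_bond = [(1, -1, 0), (-2, 1, 1)]   # length ratio sqrt(3), angle 150 degrees

for roots in (roots_single_bond, roots_double_bond, roots_triple_bond):
    C = cartan_matrix(roots)
    print(C, bond_numbers(C)[0][1])
```

In each case the diagonal entries equal 2, the off-diagonal entries are non-positive integers, and the bond number matches one of the rows of the table above, possibly with the two columns swapped.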

If $n_{ij} = 2$ or 3, then the corresponding fundamental roots have different lengths, i.e.
either $|\pi_i| < |\pi_j|$ or $|\pi_i| > |\pi_j|$. We also have the following result.
Proposition 14.17. The roots of a simple Lie algebra have, at most, two distinct lengths.


The redundancy of the Cartan matrices highlighted above is nicely taken care of by
considering Dynkin diagrams.


Definition. A Dynkin diagram associated to a Cartan matrix is constructed as follows.
1. Draw a circle for every fundamental root $\pi_i \in \Pi$;

[Figure: a single circle labelled $\pi_i$]

2. Draw $n_{ij}$ lines between the circles representing the roots $\pi_i$ and $\pi_j$;

[Figure: four pairs of circles labelled $\pi_i$ and $\pi_j$, joined by 0, 1, 2 and 3 lines for $n_{ij} = 0, 1, 2, 3$ respectively]

3. If $n_{ij} = 2$ or 3, draw an arrow on the lines from the longer root to the shorter root.

[Figure: double and triple bonds between $\pi_i$ and $\pi_j$ with the arrow drawn for each of the cases $|\pi_i| > |\pi_j|$ and $|\pi_i| < |\pi_j|$]

Dynkin diagrams completely characterise any set of fundamental roots, from which we
can reconstruct the entire root set by using the Weyl transformations. The root set can
then be used to produce a Cartan-Weyl basis.
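
The reconstruction of the root set from the fundamental roots can be sketched as a simple closure computation: start from the fundamental roots and keep applying the Weyl transformations of the fundamental roots until no new vectors appear. As a stated assumption, the roots are realised in Euclidean space with the ordinary dot product standing in for the inner product induced by the Killing form; the root vectors and function names are illustrative.

```python
from fractions import Fraction

def dot(u, v):
    """Euclidean dot product with exact rational arithmetic."""
    return sum(Fraction(a) * Fraction(b) for a, b in zip(u, v))

def reflect(lam, mu):
    """Weyl transformation s_lam(mu) = mu - 2 <lam, mu> / <lam, lam> lam."""
    c = 2 * dot(lam, mu) / dot(lam, lam)
    return tuple(Fraction(m) - c * Fraction(l) for m, l in zip(mu, lam))

def root_set(simple_roots):
    """Orbit of the fundamental roots under the reflections they generate."""
    roots = {tuple(Fraction(x) for x in r) for r in simple_roots}
    frontier = list(roots)
    while frontier:
        mu = frontier.pop()
        for pi in simple_roots:
            image = reflect(pi, mu)
            if image not in roots:
                roots.add(image)
                frontier.append(image)
    return roots

# Assumed Euclidean realisations with bond numbers 1, 2 and 3.
single = root_set([(1, -1, 0), (0, 1, -1)])
double = root_set([(1, -1), (0, 1)])
triple = root_set([(1, -1, 0), (-2, 1, 1)])

print(len(single), len(double), len(triple))
```

The computed sets are closed under negation, in line with the observation that the negative of every root is again a root.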


We are now finally ready to state the much awaited classification theorem. 


Theorem 14.18 (Killing, Cartan). Any simple finite-dimensional complex Lie algebra can
be reconstructed from its set of fundamental roots $\Pi$, which only come in the following forms.

i) There are 4 infinite families

[Figure: Dynkin diagrams of the four infinite families $A_n$ ($n \geq 1$), $B_n$ ($n \geq 2$), $C_n$ ($n \geq 3$) and $D_n$ ($n \geq 4$)]

where the restrictions on $n$ ensure that we don't get repeated diagrams (the diagram $D_2$
is excluded since it is disconnected and does not correspond to a simple Lie algebra);

ii) five exceptional cases

[Figure: Dynkin diagrams of the exceptional cases $E_6$, $E_7$, $E_8$, $F_4$ and $G_2$]

and no other. These are all the possible (connected) Dynkin diagrams.


At last, we have achieved a classification of all simple finite-dimensional complex Lie
algebras. The finite-dimensional semi-simple complex Lie algebras are direct sums of simple
Lie algebras, and correspond to disconnected Dynkin diagrams whose connected
components are the ones listed above.
15 The Lie group SL(2,C) and its Lie algebra sl(2,C)


15.1 The structure of SL(2,C)


Recall that the Lie group structure is the combination of a number of simpler structures,
which we will now examine in detail for the special linear group of degree 2 over $\mathbb{C}$, also
known as the relativistic spin group.


SL(2,C) as a set
We define the following subset of $\mathbb{C}^4 := \mathbb{C} \times \mathbb{C} \times \mathbb{C} \times \mathbb{C}$
\[ \mathrm{SL}(2,\mathbb{C}) := \left\{ \begin{pmatrix} a & b \\ c & d \end{pmatrix} \in \mathbb{C}^4 \;\middle|\; ad - bc = 1 \right\}, \]
where the array is just an alternative notation for a quadruple.


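SL(2,C) as a group


We define an operation
\[
\begin{aligned}
\bullet\colon \mathrm{SL}(2,\mathbb{C}) \times \mathrm{SL}(2,\mathbb{C}) &\to \mathrm{SL}(2,\mathbb{C}) \\
\left( \begin{pmatrix} a & b \\ c & d \end{pmatrix}, \begin{pmatrix} e & f \\ g & h \end{pmatrix} \right) &\mapsto \begin{pmatrix} a & b \\ c & d \end{pmatrix} \bullet \begin{pmatrix} e & f \\ g & h \end{pmatrix},
\end{aligned}
\]
where
\[ \begin{pmatrix} a & b \\ c & d \end{pmatrix} \bullet \begin{pmatrix} e & f \\ g & h \end{pmatrix} := \begin{pmatrix} ae + bg & af + bh \\ ce + dg & cf + dh \end{pmatrix}. \]

Formally, this operation is the same as matrix multiplication. We can check directly that
the result of applying $\bullet$ lands back in $\mathrm{SL}(2,\mathbb{C})$, or simply recall that the determinant of a
product is the product of the determinants. Moreover, the operation $\bullet$

i) is associative (straightforward but tedious to check);

ii) has an identity element, namely $\begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} \in \mathrm{SL}(2,\mathbb{C})$;

iii) admits inverses: for each $\begin{pmatrix} a & b \\ c & d \end{pmatrix} \in \mathrm{SL}(2,\mathbb{C})$, we have $\begin{pmatrix} d & -b \\ -c & a \end{pmatrix} \in \mathrm{SL}(2,\mathbb{C})$ and
\[ \begin{pmatrix} a & b \\ c & d \end{pmatrix} \bullet \begin{pmatrix} d & -b \\ -c & a \end{pmatrix} = \begin{pmatrix} d & -b \\ -c & a \end{pmatrix} \bullet \begin{pmatrix} a & b \\ c & d \end{pmatrix} = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}. \]
Hence, we have $\begin{pmatrix} a & b \\ c & d \end{pmatrix}^{-1} = \begin{pmatrix} d & -b \\ -c & a \end{pmatrix}$.

Therefore, the pair $(\mathrm{SL}(2,\mathbb{C}),\bullet)$ is a (non-commutative) group.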
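
The group axioms can be spot-checked on explicit elements. The sketch below stores a quadruple $(a,b,c,d)$ as a Python tuple of complex numbers; the two sample elements `m` and `n` and the function names are assumptions chosen for illustration (their entries are small Gaussian integers, so the floating-point arithmetic involved is exact).

```python
IDENTITY = (1, 0, 0, 1)

def mul(m, n):
    """The operation on quadruples, formally 2x2 matrix multiplication."""
    (a, b, c, d), (e, f, g, h) = m, n
    return (a * e + b * g, a * f + b * h, c * e + d * g, c * f + d * h)

def det(m):
    """The quantity ad - bc that must equal 1 on SL(2,C)."""
    a, b, c, d = m
    return a * d - b * c

def inverse(m):
    """Inverse of an element with ad - bc = 1."""
    a, b, c, d = m
    return (d, -b, -c, a)

m = (1 + 1j, 1j, 1, 1)   # ad - bc = (1 + i) - i = 1
n = (2, 1j, -1j, 1)      # ad - bc = 2 - 1 = 1

print(det(m), det(n), det(mul(m, n)))                             # closure
print(mul(m, inverse(m)) == IDENTITY, mul(inverse(m), m) == IDENTITY)
print(mul(m, n) != mul(n, m))                                     # non-commutativity
```

A check on sample elements is of course not a proof, but it makes the closure, inverse and non-commutativity statements tangible.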
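SL(2,C) as a topological space


Recall that if $N$ is a subset of $M$ and $\mathcal{O}$ is a topology on $M$, then we can equip $N$ with the
subset topology inherited from $M$
\[ \mathcal{O}|_N := \{U \cap N \mid U \in \mathcal{O}\}. \]
We begin by establishing a topology on $\mathbb{C}$ as follows. Let
\[ B_r(z) := \{y \in \mathbb{C} \mid |z - y| < r\} \]
be the open ball of radius $r > 0$ and centre $z \in \mathbb{C}$.

[Figure: an open ball $B_r(z)$ drawn in the complex plane]

Define $\mathcal{O}_{\mathbb{C}}$ implicitly by
\[ U \in \mathcal{O}_{\mathbb{C}} \;:\Leftrightarrow\; \forall\, z \in U : \exists\, r > 0 : B_r(z) \subseteq U. \]
Then, the pair $(\mathbb{C},\mathcal{O}_{\mathbb{C}})$ is a topological space. In fact, we have
\[ (\mathbb{C},\mathcal{O}_{\mathbb{C}}) \cong_{\mathrm{top}} (\mathbb{R}^2,\mathcal{O}_{\mathrm{std}}). \]
We can then equip $\mathbb{C}^4$ with the product topology $\mathcal{O}_{\mathbb{C}^4}$ so that we can finally define
\[ \mathcal{O} := (\mathcal{O}_{\mathbb{C}^4})|_{\mathrm{SL}(2,\mathbb{C})}, \]
so that the pair $(\mathrm{SL}(2,\mathbb{C}),\mathcal{O})$ is a topological space. In fact, it is a connected topological
space, and we will need this property later on.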


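SL(2,C) as a topological manifold


Recall that a topological space $(M,\mathcal{O})$ is a complex topological manifold if each point
$p \in M$ has an open neighbourhood $U(p)$ which is homeomorphic to an open subset of $\mathbb{C}^d$.
Equivalently, there must exist a $C^0$-atlas, i.e. a collection $\mathcal{A}$ of charts $(U_\alpha,x_\alpha)$, where the
$U_\alpha$ are open and cover $M$ and each $x_\alpha$ is a homeomorphism onto an open subset of $\mathbb{C}^d$.

Let $U$ be the set
\[ U := \left\{ \begin{pmatrix} a & b \\ c & d \end{pmatrix} \in \mathrm{SL}(2,\mathbb{C}) \;\middle|\; a \neq 0 \right\} \]
and define the map
\[ x\colon U \to x(U) \subseteq \mathbb{C}^* \times \mathbb{C} \times \mathbb{C}, \qquad \begin{pmatrix} a & b \\ c & d \end{pmatrix} \mapsto (a,b,c), \]
where $\mathbb{C}^* = \mathbb{C} \setminus \{0\}$. With a little work, one can show that $U$ is an open subset of
$(\mathrm{SL}(2,\mathbb{C}),\mathcal{O})$ and $x$ is a homeomorphism with inverse
\[ x^{-1}\colon x(U) \to U, \qquad (a,b,c) \mapsto \begin{pmatrix} a & b \\ c & \frac{1+bc}{a} \end{pmatrix}. \]

However, since $\mathrm{SL}(2,\mathbb{C})$ contains elements with $a = 0$, the chart $(U,x)$ does not cover the
whole space, and hence we need at least one more chart. We thus define the set
\[ V := \left\{ \begin{pmatrix} a & b \\ c & d \end{pmatrix} \in \mathrm{SL}(2,\mathbb{C}) \;\middle|\; b \neq 0 \right\} \]
and the map
\[ y\colon V \to y(V) \subseteq \mathbb{C} \times \mathbb{C}^* \times \mathbb{C}, \qquad \begin{pmatrix} a & b \\ c & d \end{pmatrix} \mapsto (a,b,d). \]

Similarly to the above, $V$ is open and $y$ is a homeomorphism with inverse
\[ y^{-1}\colon y(V) \to V, \qquad (a,b,d) \mapsto \begin{pmatrix} a & b \\ \frac{ad-1}{b} & d \end{pmatrix}. \]

An element of $\mathrm{SL}(2,\mathbb{C})$ cannot have both $a$ and $b$ equal to zero, for otherwise $ad - bc = 0 \neq 1$.
Hence $\mathcal{A}_{\mathrm{top}} := \{(U,x),(V,y)\}$ is an atlas, and since every atlas is automatically a $C^0$-atlas,
the triple $(\mathrm{SL}(2,\mathbb{C}),\mathcal{O},\mathcal{A}_{\mathrm{top}})$ is a 3-dimensional, complex, topological manifold.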

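SL(2,C) as a complex differentiable manifold


Recall that to obtain a $C^1$-differentiable manifold from a topological manifold with atlas
$\mathcal{A}$, we have to check that every transition map between charts in $\mathcal{A}$ is differentiable in the
usual sense.

In our case, we have the atlas $\mathcal{A}_{\mathrm{top}} := \{(U,x),(V,y)\}$. We evaluate
\[ (y \circ x^{-1})(a,b,c) = y\!\left( \begin{pmatrix} a & b \\ c & \frac{1+bc}{a} \end{pmatrix} \right) = \left(a,b,\tfrac{1+bc}{a}\right). \]

Hence we have the transition map
\[ y \circ x^{-1}\colon x(U \cap V) \to y(U \cap V), \qquad (a,b,c) \mapsto \left(a,b,\tfrac{1+bc}{a}\right). \]

Similarly, we have
\[ (x \circ y^{-1})(a,b,d) = x\!\left( \begin{pmatrix} a & b \\ \frac{ad-1}{b} & d \end{pmatrix} \right) = \left(a,b,\tfrac{ad-1}{b}\right). \]
Hence, the other transition map is
\[ x \circ y^{-1}\colon y(U \cap V) \to x(U \cap V), \qquad (a,b,d) \mapsto \left(a,b,\tfrac{ad-1}{b}\right). \]

Since $a \neq 0$ and $b \neq 0$ on $U \cap V$, the transition maps are complex differentiable (this is a good time
to review your complex analysis!).

Therefore, the atlas $\mathcal{A}_{\mathrm{top}}$ is a differentiable atlas. By defining $\mathcal{A}$ to be the maximal
differentiable atlas containing $\mathcal{A}_{\mathrm{top}}$, we have that $(\mathrm{SL}(2,\mathbb{C}),\mathcal{O},\mathcal{A})$ is a 3-dimensional, complex
differentiable manifold.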
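
The two charts and their transition maps can be written out directly. In the sketch below a group element is a quadruple $(a,b,c,d)$; rational entries are used, as an assumption made purely to keep the arithmetic exact (rationals are in particular complex numbers), and the function names are illustrative.

```python
from fractions import Fraction as F

def x(m):
    """Chart on U = {a != 0}: keep (a, b, c)."""
    a, b, c, d = m
    if a == 0:
        raise ValueError("element not in U")
    return (a, b, c)

def x_inv(p):
    """Inverse of x: recover d from ad - bc = 1."""
    a, b, c = p
    return (a, b, c, (1 + b * c) / a)

def y(m):
    """Chart on V = {b != 0}: keep (a, b, d)."""
    a, b, c, d = m
    if b == 0:
        raise ValueError("element not in V")
    return (a, b, d)

def y_inv(p):
    """Inverse of y: recover c from ad - bc = 1."""
    a, b, d = p
    return (a, b, (a * d - 1) / b, d)

def y_after_x_inv(p):
    """Transition map on x(U ∩ V)."""
    a, b, c = p
    return (a, b, (1 + b * c) / a)

def x_after_y_inv(p):
    """Transition map on y(U ∩ V)."""
    a, b, d = p
    return (a, b, (a * d - 1) / b)

p = (F(2), F(3), F(5))            # a point of x(U ∩ V): a != 0 and b != 0
element = x_inv(p)                # the group element it represents
a, b, c, d = element

print(a * d - b * c)                            # the defining condition
print(y(element) == y_after_x_inv(p))           # transition map agrees with y ∘ x^{-1}
print(x_after_y_inv(y_after_x_inv(p)) == p)     # the transition maps are mutually inverse
```

Both transition maps are rational functions whose denominators do not vanish on the overlap, which is the reason they are complex differentiable.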


SL(2,C) as a Lie group 


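We equipped $\mathrm{SL}(2,\mathbb{C})$ with both a group and a manifold structure. In order to obtain a Lie group structure, we have to check that these two structures are compatible, that is, we have to show that the two maps

$$\mu\colon \mathrm{SL}(2,\mathbb{C}) \times \mathrm{SL}(2,\mathbb{C}) \to \mathrm{SL}(2,\mathbb{C}), \qquad \left( \begin{pmatrix} a & b \\ c & d \end{pmatrix}, \begin{pmatrix} e & f \\ g & h \end{pmatrix} \right) \mapsto \begin{pmatrix} a & b \\ c & d \end{pmatrix} \bullet \begin{pmatrix} e & f \\ g & h \end{pmatrix}$$

and

$$i\colon \mathrm{SL}(2,\mathbb{C}) \to \mathrm{SL}(2,\mathbb{C}), \qquad \begin{pmatrix} a & b \\ c & d \end{pmatrix} \mapsto \begin{pmatrix} a & b \\ c & d \end{pmatrix}^{-1}$$

are differentiable with respect to the differentiable structure on $\mathrm{SL}(2,\mathbb{C})$. For instance, for the inverse map $i$, we have to show that the map $y \circ i \circ x^{-1}$ is differentiable in the usual sense for any pair of charts $(U, x), (V, y) \in \mathscr{A}$.

$$\begin{array}{ccc} U \subseteq \mathrm{SL}(2,\mathbb{C}) & \xrightarrow{\;\; i \;\;} & V \subseteq \mathrm{SL}(2,\mathbb{C}) \\ \downarrow {\scriptstyle x} & & \downarrow {\scriptstyle y} \\ x(U) \subseteq \mathbb{C}^3 & \xrightarrow{\; y \circ i \circ x^{-1} \;} & y(V) \subseteq \mathbb{C}^3 \end{array}$$

However, since $\mathrm{SL}(2,\mathbb{C})$ is connected, the differentiability of the transition maps in $\mathscr{A}$ implies that if $y \circ i \circ x^{-1}$ is differentiable for any two given charts, then it is differentiable for all charts in $\mathscr{A}$. Hence, we can simply let $(U, x)$ and $(V, y)$ be the two charts on $\mathrm{SL}(2,\mathbb{C})$ defined above. Then, we have

$$(y \circ i \circ x^{-1})(a, b, c) = (y \circ i)\left( \begin{pmatrix} a & b \\ c & \frac{1+bc}{a} \end{pmatrix} \right) = y\left( \begin{pmatrix} \frac{1+bc}{a} & -b \\ -c & a \end{pmatrix} \right) = \left( \tfrac{1+bc}{a}, -b, a \right),$$

which is certainly complex differentiable as a map between open subsets of $\mathbb{C}^3$ (recall that $a \neq 0$ on $x(U)$).

Checking that $\mu$ is complex differentiable is slightly more involved, since we first have to equip $\mathrm{SL}(2,\mathbb{C}) \times \mathrm{SL}(2,\mathbb{C})$ with a suitable "product differentiable structure" and then proceed as above. Once that is done, we can finally conclude that $((\mathrm{SL}(2,\mathbb{C}), \mathcal{O}, \mathscr{A}), \bullet)$ is a 3-dimensional complex Lie group.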


15.2 The Lie algebra of SL(2,C) 


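Recall that to every Lie group $G$, there is an associated Lie algebra $\mathcal{L}(G)$, where

$$\mathcal{L}(G) := \{ X \in \Gamma(TG) \mid \forall\, g, h \in G : (\ell_g)_*(X|_h) = X|_{gh} \},$$

which we then proved to be isomorphic to the Lie algebra $T_eG$ with Lie bracket

$$[A, B]_{T_eG} := j^{-1}\big( [j(A), j(B)]_{\mathcal{L}(G)} \big)$$

induced by the Lie bracket on $\mathcal{L}(G)$ via the isomorphism $j\colon T_eG \to \mathcal{L}(G)$, $j(A)|_g := (\ell_g)_*(A)$.

In the case of $\mathrm{SL}(2,\mathbb{C})$, the left translation map by $\left(\begin{smallmatrix} a & b \\ c & d \end{smallmatrix}\right)$ is

$$\ell_{\left(\begin{smallmatrix} a & b \\ c & d \end{smallmatrix}\right)}\colon \mathrm{SL}(2,\mathbb{C}) \to \mathrm{SL}(2,\mathbb{C}), \qquad \begin{pmatrix} e & f \\ g & h \end{pmatrix} \mapsto \begin{pmatrix} a & b \\ c & d \end{pmatrix} \bullet \begin{pmatrix} e & f \\ g & h \end{pmatrix}.$$

By using the standard notation $\mathfrak{sl}(2,\mathbb{C}) \equiv \mathcal{L}(\mathrm{SL}(2,\mathbb{C}))$, we have

$$\mathfrak{sl}(2,\mathbb{C}) \cong_{\text{Lie alg}} T_{\left(\begin{smallmatrix} 1 & 0 \\ 0 & 1 \end{smallmatrix}\right)} \mathrm{SL}(2,\mathbb{C}).$$

We would now like to explicitly determine the Lie bracket on $T_{\left(\begin{smallmatrix} 1 & 0 \\ 0 & 1 \end{smallmatrix}\right)} \mathrm{SL}(2,\mathbb{C})$, and hence determine its structure constants.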

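Recall that if $(U, x)$ is a chart on a manifold $M$ and $p \in U$, then the chart $(U, x)$ induces a basis of the tangent space $T_pM$. We shall use our previously defined chart $(U, x)$ on $\mathrm{SL}(2,\mathbb{C})$, where $U := \{ \left(\begin{smallmatrix} a & b \\ c & d \end{smallmatrix}\right) \in \mathrm{SL}(2,\mathbb{C}) \mid a \neq 0 \}$ and

$$x\colon U \to x(U) \subseteq \mathbb{C}^3, \qquad \begin{pmatrix} a & b \\ c & d \end{pmatrix} \mapsto (a, b, c).$$

Note that the $d$ appearing here is completely redundant, since the membership condition of $\mathrm{SL}(2,\mathbb{C})$ forces $d = \frac{1+bc}{a}$. However, we will keep writing the $d$ to avoid having a fraction in a matrix in a subscript.

The chart $(U, x)$ contains $\left(\begin{smallmatrix} 1 & 0 \\ 0 & 1 \end{smallmatrix}\right)$ and hence we get an induced co-ordinate basis

$$\left\{ \left( \frac{\partial}{\partial x^i} \right)_{\left(\begin{smallmatrix} 1 & 0 \\ 0 & 1 \end{smallmatrix}\right)} \in T_{\left(\begin{smallmatrix} 1 & 0 \\ 0 & 1 \end{smallmatrix}\right)} \mathrm{SL}(2,\mathbb{C}) \;\middle|\; 1 \leq i \leq 3 \right\}$$

so that any $A \in T_{\left(\begin{smallmatrix} 1 & 0 \\ 0 & 1 \end{smallmatrix}\right)} \mathrm{SL}(2,\mathbb{C})$ can be written as

$$A = \alpha \left( \frac{\partial}{\partial x^1} \right)_{\left(\begin{smallmatrix} 1 & 0 \\ 0 & 1 \end{smallmatrix}\right)} + \beta \left( \frac{\partial}{\partial x^2} \right)_{\left(\begin{smallmatrix} 1 & 0 \\ 0 & 1 \end{smallmatrix}\right)} + \gamma \left( \frac{\partial}{\partial x^3} \right)_{\left(\begin{smallmatrix} 1 & 0 \\ 0 & 1 \end{smallmatrix}\right)}$$

for some $\alpha, \beta, \gamma \in \mathbb{C}$. Since the Lie bracket is bilinear, its action on these basis vectors uniquely extends to the whole of $T_{\left(\begin{smallmatrix} 1 & 0 \\ 0 & 1 \end{smallmatrix}\right)} \mathrm{SL}(2,\mathbb{C})$ by linear continuation. Hence, we simply have to determine the action of the Lie bracket of $\mathfrak{sl}(2,\mathbb{C})$ on the images under the isomorphism $j$ of these basis vectors.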


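Let us now determine the image of these co-ordinate induced basis elements under the isomorphism $j$. The object

$$j\left( \left( \frac{\partial}{\partial x^i} \right)_{\left(\begin{smallmatrix} 1 & 0 \\ 0 & 1 \end{smallmatrix}\right)} \right) \in \mathfrak{sl}(2,\mathbb{C})$$

is a left-invariant vector field on $\mathrm{SL}(2,\mathbb{C})$. It assigns to each point $\left(\begin{smallmatrix} a & b \\ c & d \end{smallmatrix}\right) \in U \subseteq \mathrm{SL}(2,\mathbb{C})$ the tangent vector

$$j\left( \left( \frac{\partial}{\partial x^i} \right)_{\left(\begin{smallmatrix} 1 & 0 \\ 0 & 1 \end{smallmatrix}\right)} \right)\Bigg|_{\left(\begin{smallmatrix} a & b \\ c & d \end{smallmatrix}\right)} := \left( \ell_{\left(\begin{smallmatrix} a & b \\ c & d \end{smallmatrix}\right)} \right)_* \left( \frac{\partial}{\partial x^i} \right)_{\left(\begin{smallmatrix} 1 & 0 \\ 0 & 1 \end{smallmatrix}\right)} \in T_{\left(\begin{smallmatrix} a & b \\ c & d \end{smallmatrix}\right)} \mathrm{SL}(2,\mathbb{C}).$$

This tangent vector is a $\mathbb{C}$-linear map $C^\infty(\mathrm{SL}(2,\mathbb{C})) \to \mathbb{C}$, where $C^\infty(\mathrm{SL}(2,\mathbb{C}))$ is the $\mathbb{C}$-vector space (in fact, the $\mathbb{C}$-algebra) of smooth complex-valued functions on $\mathrm{SL}(2,\mathbb{C})$ although, to be precise, since we are working in a chart we should only consider functions defined on $U$. Writing $\ell := \ell_{\left(\begin{smallmatrix} a & b \\ c & d \end{smallmatrix}\right)}$ and $\mathbb{1} := \left(\begin{smallmatrix} 1 & 0 \\ 0 & 1 \end{smallmatrix}\right)$ for brevity, for (the restriction to $U$ of) any $f \in C^\infty(\mathrm{SL}(2,\mathbb{C}))$ we have, explicitly,

$$\begin{aligned} \left( \ell_* \left( \frac{\partial}{\partial x^i} \right)_{\mathbb{1}} \right)(f) &= \left( \frac{\partial}{\partial x^i} \right)_{\mathbb{1}} (f \circ \ell) \\ &= \partial_i \big( f \circ \ell \circ x^{-1} \big)(x(\mathbb{1})), \end{aligned}$$

where the argument of $\partial_i$ in the last line is a map $x(U) \subseteq \mathbb{C}^3 \to \mathbb{C}$, hence $\partial_i$ is simply the operation of complex differentiation with respect to the $i$-th (out of the 3) complex variable of the map $f \circ \ell \circ x^{-1}$, which is then to be evaluated at $x(\mathbb{1}) \in \mathbb{C}^3$. By inserting an identity in the composition, we have

$$= \partial_i \big( (f \circ x^{-1}) \circ (x \circ \ell \circ x^{-1}) \big)(x(\mathbb{1})),$$

where $f \circ x^{-1}\colon x(U) \subseteq \mathbb{C}^3 \to \mathbb{C}$ and $(x \circ \ell \circ x^{-1})\colon x(U) \subseteq \mathbb{C}^3 \to x(U) \subseteq \mathbb{C}^3$ and hence, we can use the multi-dimensional chain rule to obtain

$$= \Big( \partial_m (f \circ x^{-1}) \big( (x \circ \ell \circ x^{-1})(x(\mathbb{1})) \big) \Big) \Big( \partial_i (x^m \circ \ell \circ x^{-1})(x(\mathbb{1})) \Big),$$

with the summation going from $m = 1$ to $m = 3$. The first factor is simply

$$\partial_m (f \circ x^{-1}) \big( (x \circ \ell)(\mathbb{1}) \big) = \partial_m (f \circ x^{-1}) \big( x \left(\begin{smallmatrix} a & b \\ c & d \end{smallmatrix}\right) \big) =: \left( \frac{\partial}{\partial x^m} \right)_{\left(\begin{smallmatrix} a & b \\ c & d \end{smallmatrix}\right)} (f).$$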

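To see what the second factor is, we first consider the map $x^m \circ \ell_{\left(\begin{smallmatrix} a & b \\ c & d \end{smallmatrix}\right)} \circ x^{-1}$. This map acts on the triple $(e, f, g) \in x(U)$ as

$$\begin{aligned} (x^m \circ \ell_{\left(\begin{smallmatrix} a & b \\ c & d \end{smallmatrix}\right)} \circ x^{-1})(e, f, g) &= x^m \left( \begin{pmatrix} a & b \\ c & d \end{pmatrix} \bullet \begin{pmatrix} e & f \\ g & \frac{1+fg}{e} \end{pmatrix} \right) \\ &= x^m \left( \begin{pmatrix} ae + bg & af + \frac{b(1+fg)}{e} \\ ce + dg & cf + \frac{d(1+fg)}{e} \end{pmatrix} \right), \end{aligned}$$

and since $x^m := \mathrm{proj}_m \circ x$, with $m \in \{1, 2, 3\}$, we have

$$(x^m \circ \ell_{\left(\begin{smallmatrix} a & b \\ c & d \end{smallmatrix}\right)} \circ x^{-1})(e, f, g) = \mathrm{proj}_m \left( ae + bg,\; af + \tfrac{b(1+fg)}{e},\; ce + dg \right),$$

where the map $\mathrm{proj}_m$ simply picks the $m$-th component of the triple. We now have to apply $\partial_i$ to this map, with $i \in \{1, 2, 3\}$, i.e. we have to differentiate with respect to each of the three complex variables $e$, $f$, and $g$. We can write the result as

$$\partial_i (x^m \circ \ell_{\left(\begin{smallmatrix} a & b \\ c & d \end{smallmatrix}\right)} \circ x^{-1})(e, f, g) = D(e, f, g)^m{}_i,$$

where $m$ labels the rows and $i$ the columns of the matrix

$$D(e, f, g) = \begin{pmatrix} a & 0 & b \\ -\frac{b(1+fg)}{e^2} & a + \frac{bg}{e} & \frac{bf}{e} \\ c & 0 & d \end{pmatrix}.$$

Finally, by evaluating this at $(e, f, g) = x\left(\begin{smallmatrix} 1 & 0 \\ 0 & 1 \end{smallmatrix}\right) = (1, 0, 0)$, we obtain

$$\partial_i (x^m \circ \ell_{\left(\begin{smallmatrix} a & b \\ c & d \end{smallmatrix}\right)} \circ x^{-1}) \big( x\left(\begin{smallmatrix} 1 & 0 \\ 0 & 1 \end{smallmatrix}\right) \big) = D^m{}_i,$$

where, by recalling that $d = \frac{1+bc}{a}$,

$$D := D(1, 0, 0) = \begin{pmatrix} a & 0 & b \\ -b & a & 0 \\ c & 0 & \frac{1+bc}{a} \end{pmatrix}.$$

Putting the two factors together and noting that this holds for an arbitrary $f \in C^\infty(\mathrm{SL}(2,\mathbb{C}))$, we have

$$j\left( \left( \frac{\partial}{\partial x^i} \right)_{\left(\begin{smallmatrix} 1 & 0 \\ 0 & 1 \end{smallmatrix}\right)} \right)\Bigg|_{\left(\begin{smallmatrix} a & b \\ c & d \end{smallmatrix}\right)} = \left( \ell_{\left(\begin{smallmatrix} a & b \\ c & d \end{smallmatrix}\right)} \right)_* \left( \frac{\partial}{\partial x^i} \right)_{\left(\begin{smallmatrix} 1 & 0 \\ 0 & 1 \end{smallmatrix}\right)} = D^m{}_i \left( \frac{\partial}{\partial x^m} \right)_{\left(\begin{smallmatrix} a & b \\ c & d \end{smallmatrix}\right)}$$

and since the point $\left(\begin{smallmatrix} a & b \\ c & d \end{smallmatrix}\right) \in U \subseteq \mathrm{SL}(2,\mathbb{C})$ is also arbitrary, we have

$$j\left( \left( \frac{\partial}{\partial x^i} \right)_{\left(\begin{smallmatrix} 1 & 0 \\ 0 & 1 \end{smallmatrix}\right)} \right) = D^m{}_i \frac{\partial}{\partial x^m},$$

where $D$ is now the corresponding matrix of co-ordinate functions

$$D := \begin{pmatrix} x^1 & 0 & x^2 \\ -x^2 & x^1 & 0 \\ x^3 & 0 & \frac{1 + x^2 x^3}{x^1} \end{pmatrix}.$$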
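The matrix $D$ at a given group element can be cross-checked numerically. The sketch below approximates the Jacobian of $x \circ \ell \circ x^{-1}$ at $(1,0,0)$ by central differences and compares it with the closed form. The helper names and the sample element `g` are hypothetical choices for this illustration only.

```python
# Sketch: finite-difference check of D^m_i = ∂_i (x^m ∘ ℓ_g ∘ x^{-1}) at (1, 0, 0).

def chart_left_translation(g, p):
    """(x ∘ ℓ_g ∘ x^{-1})(e, f, k), with g given by its chart values (a, b, c)."""
    a, b, c = g
    d = (1 + b * c) / a          # fourth entry of g, fixed by det = 1
    e, f, k = p
    h = (1 + f * k) / e          # fourth entry of x^{-1}(e, f, k)
    # first row and lower-left entry of the matrix product g • x^{-1}(p)
    return (a * e + b * k, a * f + b * h, c * e + d * k)

def jacobian(func, p, step=1e-6):
    """Central-difference Jacobian; rows are labelled by m, columns by i."""
    columns = []
    for i in range(3):
        plus, minus = list(p), list(p)
        plus[i] += step
        minus[i] -= step
        f_plus, f_minus = func(tuple(plus)), func(tuple(minus))
        columns.append([(f_plus[m] - f_minus[m]) / (2 * step) for m in range(3)])
    return [[columns[i][m] for i in range(3)] for m in range(3)]

g = (2 + 1j, 1 - 1j, 3j)         # sample element of U (a != 0), chosen arbitrarily
a, b, c = g

D_numeric = jacobian(lambda p: chart_left_translation(g, p), (1, 0, 0))
D_exact = [[a, 0, b],
           [-b, a, 0],
           [c, 0, (1 + b * c) / a]]
```

Since the map is holomorphic, differentiating along the real direction is enough to approximate the complex partial derivatives.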


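Note that while the three vector fields

$$\frac{\partial}{\partial x^m}\colon U \to T\,\mathrm{SL}(2,\mathbb{C}), \qquad \begin{pmatrix} a & b \\ c & d \end{pmatrix} \mapsto \left( \frac{\partial}{\partial x^m} \right)_{\left(\begin{smallmatrix} a & b \\ c & d \end{smallmatrix}\right)}$$

are not individually left-invariant, their linear combination with coefficients $D^m{}_i$ is indeed left-invariant. Recall that these vector fields

i) are $\mathbb{C}$-linear maps

$$\frac{\partial}{\partial x^m}\colon C^\infty(\mathrm{SL}(2,\mathbb{C})) \to C^\infty(\mathrm{SL}(2,\mathbb{C})), \qquad f \mapsto \partial_m (f \circ x^{-1}) \circ x;$$

ii) satisfy the Leibniz rule

$$\frac{\partial}{\partial x^m}(fg) = f \frac{\partial}{\partial x^m}(g) + g \frac{\partial}{\partial x^m}(f);$$

iii) act on the co-ordinate functions $x^i \in C^\infty(\mathrm{SL}(2,\mathbb{C}))$ as

$$\frac{\partial}{\partial x^m}(x^i) = \partial_m (x^i \circ x^{-1}) \circ x = \partial_m (\mathrm{proj}_i \circ x \circ x^{-1}) \circ x = \delta^i_m \circ x = \delta^i_m,$$

since the composition of a constant function with any composable function is just the constant function.

Hence, reading off the columns of $D$, we have an expansion of the images of the basis of $T_{\left(\begin{smallmatrix} 1 & 0 \\ 0 & 1 \end{smallmatrix}\right)} \mathrm{SL}(2,\mathbb{C})$ under $j$:

$$\begin{aligned} j\left( \left( \frac{\partial}{\partial x^1} \right)_{\left(\begin{smallmatrix} 1 & 0 \\ 0 & 1 \end{smallmatrix}\right)} \right) &= x^1 \frac{\partial}{\partial x^1} - x^2 \frac{\partial}{\partial x^2} + x^3 \frac{\partial}{\partial x^3}, \\ j\left( \left( \frac{\partial}{\partial x^2} \right)_{\left(\begin{smallmatrix} 1 & 0 \\ 0 & 1 \end{smallmatrix}\right)} \right) &= x^1 \frac{\partial}{\partial x^2}, \\ j\left( \left( \frac{\partial}{\partial x^3} \right)_{\left(\begin{smallmatrix} 1 & 0 \\ 0 & 1 \end{smallmatrix}\right)} \right) &= x^2 \frac{\partial}{\partial x^1} + \frac{1 + x^2 x^3}{x^1} \frac{\partial}{\partial x^3}. \end{aligned}$$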


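We now have to calculate the bracket (in $\mathfrak{sl}(2,\mathbb{C})$) of every pair of these. We can also do them all at once, which is a good exercise in index gymnastics. We have

$$\left[ j\left( \left( \frac{\partial}{\partial x^i} \right)_{\left(\begin{smallmatrix} 1 & 0 \\ 0 & 1 \end{smallmatrix}\right)} \right), j\left( \left( \frac{\partial}{\partial x^k} \right)_{\left(\begin{smallmatrix} 1 & 0 \\ 0 & 1 \end{smallmatrix}\right)} \right) \right] = \left[ D^m{}_i \frac{\partial}{\partial x^m}, D^n{}_k \frac{\partial}{\partial x^n} \right].$$

Letting this act on an arbitrary $f \in C^\infty(\mathrm{SL}(2,\mathbb{C}))$, by definition

$$\left[ D^m{}_i \frac{\partial}{\partial x^m}, D^n{}_k \frac{\partial}{\partial x^n} \right](f) = D^m{}_i \frac{\partial}{\partial x^m} \left( D^n{}_k \frac{\partial}{\partial x^n}(f) \right) - D^n{}_k \frac{\partial}{\partial x^n} \left( D^m{}_i \frac{\partial}{\partial x^m}(f) \right).$$

The first term gives

$$\begin{aligned} D^m{}_i \frac{\partial}{\partial x^m} \left( D^n{}_k \frac{\partial}{\partial x^n}(f) \right) &= D^m{}_i \frac{\partial}{\partial x^m} \big( D^n{}_k\, \partial_n (f \circ x^{-1}) \circ x \big) \\ &= D^m{}_i \frac{\partial}{\partial x^m}(D^n{}_k) \big( \partial_n (f \circ x^{-1}) \circ x \big) + D^m{}_i D^n{}_k \frac{\partial}{\partial x^m} \big( \partial_n (f \circ x^{-1}) \circ x \big) \\ &= D^m{}_i \frac{\partial}{\partial x^m}(D^n{}_k) \big( \partial_n (f \circ x^{-1}) \circ x \big) + D^m{}_i D^n{}_k\, \partial_m \big( \partial_n (f \circ x^{-1}) \circ x \circ x^{-1} \big) \circ x \\ &= D^m{}_i \frac{\partial}{\partial x^m}(D^n{}_k) \big( \partial_n (f \circ x^{-1}) \circ x \big) + D^m{}_i D^n{}_k\, \partial_m \partial_n (f \circ x^{-1}) \circ x. \end{aligned}$$

Similarly, we have

$$D^n{}_k \frac{\partial}{\partial x^n} \left( D^m{}_i \frac{\partial}{\partial x^m}(f) \right) = D^n{}_k \frac{\partial}{\partial x^n}(D^m{}_i) \big( \partial_m (f \circ x^{-1}) \circ x \big) + D^n{}_k D^m{}_i\, \partial_n \partial_m (f \circ x^{-1}) \circ x.$$

Hence, recalling that $\partial_m \partial_n = \partial_n \partial_m$ by Schwarz's theorem, we have

$$\begin{aligned} \left[ D^m{}_i \frac{\partial}{\partial x^m}, D^n{}_k \frac{\partial}{\partial x^n} \right](f) &= D^m{}_i \frac{\partial}{\partial x^m}(D^n{}_k) \big( \partial_n (f \circ x^{-1}) \circ x \big) + D^m{}_i D^n{}_k\, \partial_m \partial_n (f \circ x^{-1}) \circ x \\ &\quad - D^n{}_k \frac{\partial}{\partial x^n}(D^m{}_i) \big( \partial_m (f \circ x^{-1}) \circ x \big) - D^n{}_k D^m{}_i\, \partial_n \partial_m (f \circ x^{-1}) \circ x \\ &= \left( D^m{}_i \frac{\partial}{\partial x^m}(D^n{}_k) - D^m{}_k \frac{\partial}{\partial x^m}(D^n{}_i) \right) \partial_n (f \circ x^{-1}) \circ x, \end{aligned}$$

where we relabelled some dummy indices. Since the $f \in C^\infty(\mathrm{SL}(2,\mathbb{C}))$ was arbitrary,

$$\left[ D^m{}_i \frac{\partial}{\partial x^m}, D^n{}_k \frac{\partial}{\partial x^n} \right] = \left( D^m{}_i \frac{\partial}{\partial x^m}(D^n{}_k) - D^m{}_k \frac{\partial}{\partial x^m}(D^n{}_i) \right) \frac{\partial}{\partial x^n}.$$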


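We can now evaluate this explicitly. For $i = 1$ and $k = 2$, we have

$$\begin{aligned} \left[ D^m{}_1 \frac{\partial}{\partial x^m}, D^n{}_2 \frac{\partial}{\partial x^n} \right] &= \left( D^m{}_1 \frac{\partial}{\partial x^m}(D^n{}_2) - D^m{}_2 \frac{\partial}{\partial x^m}(D^n{}_1) \right) \frac{\partial}{\partial x^n} \\ &= \left( D^m{}_1 \frac{\partial}{\partial x^m}(D^1{}_2) - D^m{}_2 \frac{\partial}{\partial x^m}(D^1{}_1) \right) \frac{\partial}{\partial x^1} \\ &\quad + \left( D^m{}_1 \frac{\partial}{\partial x^m}(D^2{}_2) - D^m{}_2 \frac{\partial}{\partial x^m}(D^2{}_1) \right) \frac{\partial}{\partial x^2} \\ &\quad + \left( D^m{}_1 \frac{\partial}{\partial x^m}(D^3{}_2) - D^m{}_2 \frac{\partial}{\partial x^m}(D^3{}_1) \right) \frac{\partial}{\partial x^3} \\ &= -D^1{}_2 \frac{\partial}{\partial x^1} + (D^1{}_1 + D^2{}_2) \frac{\partial}{\partial x^2} - D^3{}_2 \frac{\partial}{\partial x^3} \\ &= 2 x^1 \frac{\partial}{\partial x^2}. \end{aligned}$$

Similarly, we compute

$$\begin{aligned} \left[ D^m{}_1 \frac{\partial}{\partial x^m}, D^n{}_3 \frac{\partial}{\partial x^n} \right] &= \left( D^m{}_1 \frac{\partial}{\partial x^m}(D^n{}_3) - D^m{}_3 \frac{\partial}{\partial x^m}(D^n{}_1) \right) \frac{\partial}{\partial x^n} \\ &= (D^2{}_1 - D^1{}_3) \frac{\partial}{\partial x^1} + D^2{}_3 \frac{\partial}{\partial x^2} + \left( D^m{}_1 \frac{\partial}{\partial x^m}(D^3{}_3) - D^3{}_3 \right) \frac{\partial}{\partial x^3} \\ &= -2 x^2 \frac{\partial}{\partial x^1} - 2\, \frac{1 + x^2 x^3}{x^1} \frac{\partial}{\partial x^3} \end{aligned}$$

and

$$\begin{aligned} \left[ D^m{}_2 \frac{\partial}{\partial x^m}, D^n{}_3 \frac{\partial}{\partial x^n} \right] &= \left( D^m{}_2 \frac{\partial}{\partial x^m}(D^n{}_3) - D^m{}_3 \frac{\partial}{\partial x^m}(D^n{}_2) \right) \frac{\partial}{\partial x^n} \\ &= D^2{}_2 \frac{\partial}{\partial x^1} - D^1{}_3 \frac{\partial}{\partial x^2} + D^2{}_2 \frac{\partial}{\partial x^2}(D^3{}_3) \frac{\partial}{\partial x^3} \\ &= x^1 \frac{\partial}{\partial x^1} - x^2 \frac{\partial}{\partial x^2} + x^3 \frac{\partial}{\partial x^3}, \end{aligned}$$

where the differentiation rules that we have used come from the definition of the vector field $\frac{\partial}{\partial x^m}$, the Leibniz rule, and the action on co-ordinate functions.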
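The three brackets can be sanity-checked numerically at the identity, whose chart image is $(1, 0, 0)$. The sketch below uses the component formula $[V, W]^n = V^m \partial_m W^n - W^m \partial_m V^n$ with central differences. All function names are our own, and the finite-difference step is an arbitrary choice.

```python
# Sketch: numerical Lie brackets of the left-invariant fields j(∂/∂x^i),
# whose components in the chart are the columns of the matrix D.

def D(p):
    """Matrix of co-ordinate functions D^m_i evaluated at p = (x1, x2, x3)."""
    x1, x2, x3 = p
    return [[x1, 0.0, x2],
            [-x2, x1, 0.0],
            [x3, 0.0, (1 + x2 * x3) / x1]]

def field(i):
    """Component function p -> (V^1, V^2, V^3) for column i of D (0-based)."""
    return lambda p: [D(p)[m][i] for m in range(3)]

def directional_derivative(func, p, v, step=1e-6):
    """Approximate v^m ∂_m func^n at p by a central difference along v."""
    plus = [p[m] + step * v[m] for m in range(3)]
    minus = [p[m] - step * v[m] for m in range(3)]
    f_plus, f_minus = func(plus), func(minus)
    return [(f_plus[n] - f_minus[n]) / (2 * step) for n in range(3)]

def bracket(V, W, p):
    """Components of [V, W] at p."""
    dW = directional_derivative(W, p, V(p))
    dV = directional_derivative(V, p, W(p))
    return [dW[n] - dV[n] for n in range(3)]

identity = [1.0, 0.0, 0.0]
b12 = bracket(field(0), field(1), identity)   # approximately (0, 2, 0)
b13 = bracket(field(0), field(2), identity)   # approximately (0, 0, -2)
b23 = bracket(field(1), field(2), identity)   # approximately (1, 0, 0)
```

Read as coefficients with respect to the co-ordinate basis at the identity, these three triples are the brackets $2\,\partial_2$, $-2\,\partial_3$ and $\partial_1$.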

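By applying $j^{-1}$, which is just evaluation at the identity, to these vector fields, we finally see that the induced Lie bracket on $T_{\left(\begin{smallmatrix} 1 & 0 \\ 0 & 1 \end{smallmatrix}\right)} \mathrm{SL}(2,\mathbb{C})$ satisfies

$$\begin{aligned} \left[ \left( \frac{\partial}{\partial x^1} \right)_{\left(\begin{smallmatrix} 1 & 0 \\ 0 & 1 \end{smallmatrix}\right)}, \left( \frac{\partial}{\partial x^2} \right)_{\left(\begin{smallmatrix} 1 & 0 \\ 0 & 1 \end{smallmatrix}\right)} \right] &= 2 \left( \frac{\partial}{\partial x^2} \right)_{\left(\begin{smallmatrix} 1 & 0 \\ 0 & 1 \end{smallmatrix}\right)}, \\ \left[ \left( \frac{\partial}{\partial x^1} \right)_{\left(\begin{smallmatrix} 1 & 0 \\ 0 & 1 \end{smallmatrix}\right)}, \left( \frac{\partial}{\partial x^3} \right)_{\left(\begin{smallmatrix} 1 & 0 \\ 0 & 1 \end{smallmatrix}\right)} \right] &= -2 \left( \frac{\partial}{\partial x^3} \right)_{\left(\begin{smallmatrix} 1 & 0 \\ 0 & 1 \end{smallmatrix}\right)}, \\ \left[ \left( \frac{\partial}{\partial x^2} \right)_{\left(\begin{smallmatrix} 1 & 0 \\ 0 & 1 \end{smallmatrix}\right)}, \left( \frac{\partial}{\partial x^3} \right)_{\left(\begin{smallmatrix} 1 & 0 \\ 0 & 1 \end{smallmatrix}\right)} \right] &= \left( \frac{\partial}{\partial x^1} \right)_{\left(\begin{smallmatrix} 1 & 0 \\ 0 & 1 \end{smallmatrix}\right)}. \end{aligned}$$

Hence, the structure constants of $T_{\left(\begin{smallmatrix} 1 & 0 \\ 0 & 1 \end{smallmatrix}\right)} \mathrm{SL}(2,\mathbb{C})$ with respect to the co-ordinate basis are

$$C^2{}_{12} = 2, \qquad C^3{}_{13} = -2, \qquad C^1{}_{23} = 1,$$

with all others being either zero or related to these by anti-symmetry.

16 Dynkin diagrams from Lie algebras, and vice versa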


16.1 The simplicity of sl(2,C) 


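We have seen that the non-zero structure constants of $T_{\left(\begin{smallmatrix} 1 & 0 \\ 0 & 1 \end{smallmatrix}\right)} \mathrm{SL}(2,\mathbb{C})$ are

$$C^2{}_{12} = 2, \qquad C^3{}_{13} = -2, \qquad C^1{}_{23} = 1,$$

plus those related by anti-symmetry.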


Proposition 16.1. Two Lie algebras A and B are isomorphic if, and only if, there exists 
a basis of A and a basis of B in which the structure constants of A and B are the same. 


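Since we have already proved that $T_eG \cong_{\text{Lie alg}} \mathcal{L}(G)$ for any Lie group $G$, we can deduce the existence of a basis $\{X_1, X_2, X_3\}$ of $\mathfrak{sl}(2,\mathbb{C})$ with respect to which the structure constants are those listed above. In other words, we have

$$[X_1, X_2] = 2X_2, \qquad [X_1, X_3] = -2X_3, \qquad [X_2, X_3] = X_1.$$

In this basis, the Killing form of $\mathfrak{sl}(2,\mathbb{C})$ has components

$$\kappa_{ij} = C^m{}_{in} C^n{}_{jm},$$

with all indices ranging from 1 to 3. Explicitly, we have

$$\begin{aligned} \kappa_{11} &= C^m{}_{1n} C^n{}_{1m} \\ &= C^1{}_{1n} C^n{}_{11} + C^2{}_{1n} C^n{}_{12} + C^3{}_{1n} C^n{}_{13} \\ &= C^2{}_{12} C^2{}_{12} + C^3{}_{13} C^3{}_{13} \\ &= 8. \end{aligned}$$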


Since $\kappa$ is symmetric, we only need to determine $\kappa_{ij}$ for $i \leq j$. Evaluating $\kappa_{ij} = C^m{}_{in} C^n{}_{jm}$ with the structure constants above and writing the components in a $3 \times 3$ array, we find

$$[\kappa_{ij}] = \begin{pmatrix} 8 & 0 & 0 \\ 0 & 0 & 4 \\ 0 & 4 & 0 \end{pmatrix},$$

which is just shorthand for

$$\kappa(X_1, X_1) = 8, \qquad \kappa(X_2, X_3) = \kappa(X_3, X_2) = 4,$$

and $\kappa(X_i, X_j) = 0$ for all other pairs. In particular, $\kappa(X_2, X_2) = \kappa(X_3, X_3) = 0$, so $\kappa$ is not diagonal in this basis.

Proposition 16.2. The Lie algebra $\mathfrak{sl}(2,\mathbb{C})$ is semi-simple.

Proof. Since $\det[\kappa_{ij}] = -128 \neq 0$, the Killing form is non-degenerate. By Cartan's criterion, this implies that $\mathfrak{sl}(2,\mathbb{C})$ is semi-simple.

Remark 16.3. There is one more thing that can be read off from the components of $\kappa$, namely, that it is an indefinite form, i.e. the sign of $\kappa(X, X)$ can be positive or negative depending on which $X \in \mathfrak{sl}(2,\mathbb{C})$ we pick. For instance, $\kappa(X_1, X_1) = 8 > 0$, while $\kappa(X_2 - X_3, X_2 - X_3) = -2\kappa(X_2, X_3) = -8 < 0$.

A result from Lie theory states that the Killing form on the Lie algebra of a compact Lie group is always negative semi-definite, i.e. $\kappa(X, X)$ is always negative or zero, for all $X$
in the Lie algebra. Hence, we can conclude that $\mathrm{SL}(2,\mathbb{C})$ is not a compact Lie group.
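The components $\kappa_{ij} = C^m{}_{in} C^n{}_{jm}$ are easy to get wrong by hand, so here is a short sketch that evaluates them directly from the three independent structure constants. The dictionary layout and the helper names are our own choices for this illustration.

```python
# Sketch: Killing form of sl(2,C) from its structure constants.
# C[(k, i, j)] stores C^k_{ij} with 0-based indices, so X1 -> 0, X2 -> 1, X3 -> 2.

independent = {
    (1, 0, 1): 2,    # C^2_{12} = 2
    (2, 0, 2): -2,   # C^3_{13} = -2
    (0, 1, 2): 1,    # C^1_{23} = 1
}

C = {}
for (k, i, j), value in independent.items():
    C[(k, i, j)] = value
    C[(k, j, i)] = -value        # anti-symmetry in the lower indices

def c(k, i, j):
    """Structure constant C^k_{ij}, zero if not listed."""
    return C.get((k, i, j), 0)

# kappa_ij = C^m_{in} C^n_{jm}, summed over m and n
kappa = [[sum(c(m, i, n) * c(n, j, m) for m in range(3) for n in range(3))
          for j in range(3)]
         for i in range(3)]

def det3(M):
    """Determinant of a 3x3 matrix by cofactor expansion along the first row."""
    return (M[0][0] * (M[1][1] * M[2][2] - M[1][2] * M[2][1])
            - M[0][1] * (M[1][0] * M[2][2] - M[1][2] * M[2][0])
            + M[0][2] * (M[1][0] * M[2][1] - M[1][1] * M[2][0]))

determinant = det3(kappa)
# kappa == [[8, 0, 0], [0, 0, 4], [0, 4, 0]] and determinant == -128
```

A non-zero determinant is exactly the non-degeneracy needed for Cartan's criterion.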


In fact, $\mathfrak{sl}(2,\mathbb{C})$ is more than just semi-simple.

Proposition 16.4. The Lie algebra $\mathfrak{sl}(2,\mathbb{C})$ is simple.

Recall that a Lie algebra is said to be simple if it contains no non-trivial ideals, and that an ideal $I$ of a Lie algebra $L$ is a Lie subalgebra of $L$ such that

$$\forall\, x \in I : \forall\, y \in L : [x, y] \in I.$$

Proof. Consider the ideal of $\mathfrak{sl}(2,\mathbb{C})$

$$I := \{ \alpha X_1 + \beta X_2 + \gamma X_3 \mid \alpha, \beta, \gamma \text{ restricted so that } I \text{ is an ideal} \}.$$

Since the bracket is bilinear, it suffices to check the result of bracketing an arbitrary element of $I$ with each of the basis vectors of $\mathfrak{sl}(2,\mathbb{C})$. We find

$$\begin{aligned} [\alpha X_1 + \beta X_2 + \gamma X_3, X_1] &= -2\beta X_2 + 2\gamma X_3, \\ [\alpha X_1 + \beta X_2 + \gamma X_3, X_2] &= 2\alpha X_2 - \gamma X_1, \\ [\alpha X_1 + \beta X_2 + \gamma X_3, X_3] &= -2\alpha X_3 + \beta X_1. \end{aligned}$$

We need to choose $\alpha, \beta, \gamma$ so that the results always land back in $I$. Of course, we can choose $\alpha, \beta, \gamma \in \mathbb{C}$ arbitrary or $\alpha = \beta = \gamma = 0$, which correspond respectively to the trivial ideals $\mathfrak{sl}(2,\mathbb{C})$ and $0$. If none of $\alpha, \beta, \gamma$ is zero, then you can check that the right hand sides above are linearly independent, so that $I$ contains three linearly independent vectors. Since the only $n$-dimensional subspace of an $n$-dimensional vector space is the vector space itself, we have $I = L$. Thus, we are left with the following cases:

i) if $\alpha = 0$, then $I \subseteq \mathrm{span}_{\mathbb{C}}(\{X_2, X_3\})$ and hence we must have $\beta = \gamma = 0$ as well;

ii) if $\beta = 0$, then $I \subseteq \mathrm{span}_{\mathbb{C}}(\{X_1, X_3\})$, hence we must have $\alpha = 0$, so that in fact $I \subseteq \mathrm{span}_{\mathbb{C}}(\{X_3\})$, and hence $\gamma = 0$ as well;

iii) if $\gamma = 0$, then $I \subseteq \mathrm{span}_{\mathbb{C}}(\{X_1, X_2\})$, hence we must have $\alpha = 0$, so that in fact $I \subseteq \mathrm{span}_{\mathbb{C}}(\{X_2\})$, and hence $\beta = 0$ as well.

In all cases, we have $I = 0$. Therefore, there are no non-trivial ideals of $\mathfrak{sl}(2,\mathbb{C})$.

16.2 The roots and Dynkin diagram of sl(2,C)
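By observing the bracket relations of the basis elements of $\mathfrak{sl}(2,\mathbb{C})$, we can see that

$$H := \mathrm{span}_{\mathbb{C}}(\{X_1\})$$

is a Cartan subalgebra of $\mathfrak{sl}(2,\mathbb{C})$. Indeed, for any $h \in H$, there exists a $\xi \in \mathbb{C}$ such that $h = \xi X_1$, and hence we have

$$\begin{aligned} \mathrm{ad}(h) X_2 &= \xi [X_1, X_2] = 2\xi X_2, \\ \mathrm{ad}(h) X_3 &= \xi [X_1, X_3] = -2\xi X_3. \end{aligned}$$

Recall that in the section on Lie algebras, we re-interpreted these eigenvalue equations in terms of functionals $\lambda_2, \lambda_3 \in H^*$

$$\lambda_2\colon H \to \mathbb{C}, \quad \xi X_1 \mapsto 2\xi, \qquad\qquad \lambda_3\colon H \to \mathbb{C}, \quad \xi X_1 \mapsto -2\xi,$$

whereby

$$\begin{aligned} \mathrm{ad}(h) X_2 &= \lambda_2(h) X_2, \\ \mathrm{ad}(h) X_3 &= \lambda_3(h) X_3. \end{aligned}$$

Then, $\lambda_2$ and $\lambda_3$ are called the roots of $\mathfrak{sl}(2,\mathbb{C})$, so that the root set is $\Phi = \{\lambda_2, \lambda_3\}$. Of course, we are mainly interested in a subset $\Pi \subseteq \Phi$ of fundamental roots, which satisfies

i) $\Pi$ is a linearly independent subset of $H^*$;
ii) for any $\lambda \in \Phi$, we have $\lambda \in \mathrm{span}_{\epsilon, \mathbb{N}}(\Pi)$.

We can choose $\Pi := \{\lambda_2\}$, even though $\Pi := \{\lambda_3\}$ would work just as well. Since $|\Pi| = 1$, the Weyl group is generated by the single Weyl transformation

$$s_{\lambda_2}\colon H^*_{\mathbb{R}} \to H^*_{\mathbb{R}}, \qquad \mu \mapsto \mu - 2\, \frac{\kappa^*(\lambda_2, \mu)}{\kappa^*(\lambda_2, \lambda_2)}\, \lambda_2.$$

Recall that we can recover the entire root set $\Phi$ by acting on the fundamental roots with Weyl transformations. Indeed, we have

$$s_{\lambda_2}(\lambda_2) = \lambda_2 - 2\, \frac{\kappa^*(\lambda_2, \lambda_2)}{\kappa^*(\lambda_2, \lambda_2)}\, \lambda_2 = -\lambda_2 = \lambda_3,$$

as expected. Since there is only one fundamental root, the Cartan matrix is actually just a $1 \times 1$ matrix. Its only entry is a diagonal entry, and since $\mathfrak{sl}(2,\mathbb{C})$ is simple, we have

$$C = (2).$$

The Dynkin diagram of $\mathfrak{sl}(2,\mathbb{C})$ is simply a single circle,

$$\circ$$

Hence, with reference to the Cartan classification, we have $A_1 = \mathfrak{sl}(2,\mathbb{C})$.

16.3 Reconstruction of $A_2$ from its Dynkin diagram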


We have seen an example of how to construct the Dynkin diagram of a Lie algebra, albeit the simplest of this kind. Let us now consider the opposite direction. We will start from the Dynkin diagram

$$\circ\!\!-\!\!\!-\!\!\circ$$

We immediately see that we have two fundamental roots, i.e. $\Pi = \{\pi_1, \pi_2\}$, since there are two circles in the diagram. The bond number is $n_{12} = 1$, so the two fundamental roots have the same length. Moreover, by definition

$$1 = n_{12} = C_{12} C_{21}$$

and since the off-diagonal entries of the Cartan matrix are non-positive integers, the only possibility is $C_{12} = C_{21} = -1$, so that we have

$$C = \begin{pmatrix} 2 & -1 \\ -1 & 2 \end{pmatrix}.$$

To determine the angle $\varphi$ between $\pi_1$ and $\pi_2$, recall that

$$1 = n_{12} = 4 \cos^2 \varphi,$$

and hence $|\cos \varphi| = \frac{1}{2}$. There are two solutions, namely $\varphi = 60^\circ$ and $\varphi = 120^\circ$.

By definition, we have

$$\cos \varphi = \frac{\kappa^*(\pi_1, \pi_2)}{|\pi_1|\, |\pi_2|},$$

and therefore

$$0 > C_{12} = 2\, \frac{\kappa^*(\pi_1, \pi_2)}{\kappa^*(\pi_1, \pi_1)} = 2\, \frac{|\pi_1|\, |\pi_2| \cos \varphi}{|\pi_1|\, |\pi_1|} = 2\, \frac{|\pi_2|}{|\pi_1|} \cos \varphi = 2 \cos \varphi.$$

It follows that $\cos \varphi < 0$, and hence $\varphi = 120^\circ$. We can thus plot the two fundamental roots in a plane as follows.

[Figure: the fundamental roots $\pi_1$ and $\pi_2$ drawn in a plane, with equal lengths and an angle of $120^\circ$ between them.]


We can determine all the other roots in $\Phi$ by repeated action of the Weyl group. For instance, we easily find that $s_{\pi_1}(\pi_1) = -\pi_1$ and $s_{\pi_2}(\pi_2) = -\pi_2$. We also have

$$s_{\pi_1}(\pi_2) = \pi_2 - 2\, \frac{\kappa^*(\pi_1, \pi_2)}{\kappa^*(\pi_1, \pi_1)}\, \pi_1 = \pi_2 - 2 \left( -\tfrac{1}{2} \right) \pi_1 = \pi_1 + \pi_2.$$

Finally, we have $s_{\pi_1 + \pi_2}(\pi_1 + \pi_2) = -(\pi_1 + \pi_2)$. Any further action by Weyl transformations simply permutes these roots. Hence, we have

$$\Phi = \{ \pi_1, -\pi_1, \pi_2, -\pi_2, \pi_1 + \pi_2, -(\pi_1 + \pi_2) \}$$

and these are all the roots.

[Figure: the six roots $\pm\pi_1$, $\pm\pi_2$, $\pm(\pi_1 + \pi_2)$ drawn in a plane, forming a regular hexagon.]

Since $H^* = \mathrm{span}_{\mathbb{C}}(\Pi)$, we have $\dim H^* = 2$, thus the dimension of the Cartan subalgebra is also 2. Since $|\Phi| = 6$, we know that any Cartan-Weyl basis of the Lie algebra $A_2$ must have $2 + 6 = 8$ elements. Hence, the dimension of $A_2$ is 8.
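The generation of the root set from the Cartan matrix can be sketched in a few lines. Roots are stored as integer coefficient tuples with respect to $(\pi_1, \pi_2)$, and only the two reflections $s_{\pi_1}$, $s_{\pi_2}$ are used, which is enough since they generate the Weyl group. The function and variable names are our own.

```python
# Sketch: roots of A_2 generated from the Cartan matrix by simple reflections.

cartan = [[2, -1],
          [-1, 2]]

def reflect(i, root):
    """s_{pi_i}(mu) = mu - (sum_j C_ij mu_j) pi_i for mu = sum_j mu_j pi_j."""
    pairing = sum(cartan[i][j] * root[j] for j in range(len(root)))
    return tuple(coeff - pairing if k == i else coeff
                 for k, coeff in enumerate(root))

roots = {(1, 0), (0, 1)}          # the fundamental roots pi_1 and pi_2
frontier = list(roots)
while frontier:
    root = frontier.pop()
    for i in range(len(cartan)):
        image = reflect(i, root)
        if image not in roots:
            roots.add(image)
            frontier.append(image)

rank = len(cartan)                # dimension of the Cartan subalgebra
dimension = rank + len(roots)     # size of a Cartan-Weyl basis
```

The loop terminates once reflections only permute the roots already found, mirroring the argument in the text.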

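To complete our reconstruction of $A_2$, we would now like to understand how its bracket behaves. This amounts to finding its structure constants. Note that since $\dim A_2 = 8$, the structure constants $C^k{}_{ij}$ consist of $8^3 = 512$ complex numbers (not all unrelated, of course).

Denote by $\{h_1, h_2, e_3, \ldots, e_8\}$ a Cartan-Weyl basis of $A_2$, so that $H = \mathrm{span}_{\mathbb{C}}(\{h_1, h_2\})$ and the $e_\alpha$ are eigenvectors of every $h \in H$. Since $A_2$ is simple, $H$ is abelian and hence

$$[h_1, h_2] = 0 \quad \Rightarrow \quad C^k{}_{12} = C^k{}_{21} = 0, \quad \forall\, 1 \leq k \leq 8.$$

To each $e_\alpha$, for $3 \leq \alpha \leq 8$, there is an associated $\lambda_\alpha \in \Phi$ such that

$$\forall\, h \in H : \mathrm{ad}(h) e_\alpha = \lambda_\alpha(h) e_\alpha.$$

In particular, for the basis elements $h_1, h_2$,

$$\begin{aligned} [h_1, e_\alpha] &= \mathrm{ad}(h_1) e_\alpha = \lambda_\alpha(h_1) e_\alpha, \\ [h_2, e_\alpha] &= \mathrm{ad}(h_2) e_\alpha = \lambda_\alpha(h_2) e_\alpha, \end{aligned}$$

so that, for $i \in \{1, 2\}$ and with no summation over $\alpha$, we have

$$C^1{}_{i\alpha} = C^2{}_{i\alpha} = 0, \qquad C^\alpha{}_{i\alpha} = \lambda_\alpha(h_i), \qquad \forall\, 3 \leq \alpha \leq 8.$$

Finally, we need to determine $[e_\alpha, e_\beta]$. By using the Jacobi identity, we have

$$\begin{aligned} [h_i, [e_\alpha, e_\beta]] &= -[e_\alpha, [e_\beta, h_i]] - [e_\beta, [h_i, e_\alpha]] \\ &= -[e_\alpha, -\lambda_\beta(h_i) e_\beta] - [e_\beta, \lambda_\alpha(h_i) e_\alpha] \\ &= \lambda_\beta(h_i) [e_\alpha, e_\beta] + \lambda_\alpha(h_i) [e_\alpha, e_\beta] \\ &= (\lambda_\alpha(h_i) + \lambda_\beta(h_i)) [e_\alpha, e_\beta], \end{aligned}$$

that is,

$$\mathrm{ad}(h_i) [e_\alpha, e_\beta] = (\lambda_\alpha(h_i) + \lambda_\beta(h_i)) [e_\alpha, e_\beta].$$

If $\lambda_\alpha + \lambda_\beta \in \Phi$, we have $[e_\alpha, e_\beta] = \xi e_\gamma$ for some $3 \leq \gamma \leq 8$ and $\xi \in \mathbb{C}$. Let us label the roots in our previous plot as

[Figure: the root diagram with the six roots labelled by $e_3, \ldots, e_8$; the examples below use $\lambda_3 + \lambda_4 = \pi_1 + \pi_2 = \lambda_5$ and $\lambda_5 + \lambda_7 = \lambda_3$.]

Then, for example

$$\mathrm{ad}(h) [e_3, e_4] = (\pi_1 + \pi_2)(h) [e_3, e_4],$$

and hence $[e_3, e_4]$ is an eigenvector of $\mathrm{ad}(h)$ with eigenvalue $(\pi_1 + \pi_2)(h)$. But so is $e_5$! Hence, we must have $[e_3, e_4] = \xi e_5$ for some $\xi \in \mathbb{C}$. Similarly, $[e_5, e_7] = \xi e_3$ (with, in general, a different constant $\xi$), and so on.

If $\lambda_\alpha + \lambda_\beta \notin \Phi$, then in order for the equation above to hold, we must have either $[e_\alpha, e_\beta] = 0$ (so both sides are zero), or $\lambda_\alpha(h) + \lambda_\beta(h) = 0$ for all $h$, i.e. $\lambda_\alpha + \lambda_\beta = 0$ as a functional. In the latter case, we must have $[e_\alpha, e_\beta] \in H$. This follows from a stronger version of the maximality property of the Cartan subalgebra $H$ of a simple Lie algebra $L$, namely that

$$(\forall\, h \in H : [h, x] = 0) \;\Rightarrow\; x \in H.$$

Summarising, we have

$$[e_\alpha, e_\beta] \begin{cases} = \xi e_\gamma & \text{if } \lambda_\alpha + \lambda_\beta \in \Phi \\ \in H & \text{if } \lambda_\alpha + \lambda_\beta = 0 \\ = 0 & \text{otherwise} \end{cases}$$

and these relations can be used to determine the remaining structure constants of $A_2$.

17 Representation theory of Lie groups and Lie algebras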


Lie groups and Lie algebras are used in physics mostly in terms of what are called rep- 
resentations. Very often they are even defined in terms of their concrete representations. 
We took a more abstract approach by defining a Lie group as a smooth manifold with a 
compatible group structure, and its associated Lie algebra as the space of left-invariant 
vector fields, which we then showed to be isomorphic to the tangent space at the identity. 


17.1 Representations of Lie algebras 

Definition. Let $L$ be a Lie algebra. A representation of $L$ is a Lie algebra homomorphism

$$\rho\colon L \to \mathrm{End}(V),$$

where $V$ is some finite-dimensional vector space over the same field as $L$.

Recall that a linear map $\rho\colon L \to \mathrm{End}(V)$ is a Lie algebra homomorphism if

$$\forall\, x, y \in L : \rho([x, y]) = [\rho(x), \rho(y)] := \rho(x) \circ \rho(y) - \rho(y) \circ \rho(x),$$

where the right hand side is the natural Lie bracket on $\mathrm{End}(V)$.

Definition. Let $\rho\colon L \to \mathrm{End}(V)$ be a representation of $L$.

i) The vector space $V$ is called the representation space of $\rho$.
ii) The dimension of the representation $\rho$ is $\dim V$.


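Example 17.1. Consider the Lie algebra $\mathfrak{sl}(2,\mathbb{C})$. We constructed a basis $\{X_1, X_2, X_3\}$ satisfying the relations

$$[X_1, X_2] = 2X_2, \qquad [X_1, X_3] = -2X_3, \qquad [X_2, X_3] = X_1.$$

Let $\rho\colon \mathfrak{sl}(2,\mathbb{C}) \to \mathrm{End}(\mathbb{C}^2)$ be the linear map defined by

$$\rho(X_1) := \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}, \qquad \rho(X_2) := \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}, \qquad \rho(X_3) := \begin{pmatrix} 0 & 0 \\ 1 & 0 \end{pmatrix}$$

(recall that a linear map is completely determined by its action on a basis, by linear continuation). To check that $\rho$ is a representation of $\mathfrak{sl}(2,\mathbb{C})$, we calculate

$$\begin{aligned} [\rho(X_1), \rho(X_2)] &= \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix} \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix} - \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix} \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix} \\ &= \begin{pmatrix} 0 & 2 \\ 0 & 0 \end{pmatrix} = 2\rho(X_2) = \rho([X_1, X_2]). \end{aligned}$$

Similarly, we find

$$[\rho(X_1), \rho(X_3)] = -2\rho(X_3) = \rho([X_1, X_3]), \qquad [\rho(X_2), \rho(X_3)] = \rho(X_1) = \rho([X_2, X_3]).$$

By linear continuation, $\rho([x, y]) = [\rho(x), \rho(y)]$ for any $x, y \in \mathfrak{sl}(2,\mathbb{C})$ and hence, $\rho$ is a 2-dimensional representation of $\mathfrak{sl}(2,\mathbb{C})$ with representation space $\mathbb{C}^2$. Note that we have

$$\begin{aligned} \mathrm{im}_\rho(\mathfrak{sl}(2,\mathbb{C})) &= \left\{ \begin{pmatrix} a & b \\ c & d \end{pmatrix} \in \mathrm{End}(\mathbb{C}^2) \;\middle|\; a + d = 0 \right\} \\ &= \{ \phi \in \mathrm{End}(\mathbb{C}^2) \mid \mathrm{tr}\, \phi = 0 \}. \end{aligned}$$

This is how $\mathfrak{sl}(2,\mathbb{C})$ is often defined in physics courses, i.e. as the algebra of $2 \times 2$ complex
traceless matrices.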
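The three commutators in this example can be verified mechanically. The following is a minimal sketch with plain nested lists as matrices; the helper functions `matmul`, `commutator` and `scale` are written out here only for illustration.

```python
# Sketch: check that rho respects the bracket relations of sl(2,C).

def matmul(A, B):
    """Product of two matrices given as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))]
            for i in range(len(A))]

def commutator(A, B):
    """[A, B] = AB - BA."""
    AB, BA = matmul(A, B), matmul(B, A)
    return [[p - q for p, q in zip(row_ab, row_ba)]
            for row_ab, row_ba in zip(AB, BA)]

def scale(factor, A):
    return [[factor * entry for entry in row] for row in A]

rho = {
    1: [[1, 0], [0, -1]],   # rho(X1)
    2: [[0, 1], [0, 0]],    # rho(X2)
    3: [[0, 0], [1, 0]],    # rho(X3)
}

# each tuple (i, j, coeff, k) encodes the relation [X_i, X_j] = coeff * X_k
relations = [(1, 2, 2, 2), (1, 3, -2, 3), (2, 3, 1, 1)]

holds = [commutator(rho[i], rho[j]) == scale(coeff, rho[k])
         for i, j, coeff, k in relations]
traces = [rho[i][0][0] + rho[i][1][1] for i in (1, 2, 3)]
```

All three images are traceless, in line with the description of the image of $\rho$ above.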


Two representations of a Lie algebra can be related in the following sense. 


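Definition. Let $L$ be a Lie algebra and let

$$\rho_1\colon L \to \mathrm{End}(V_1), \qquad \rho_2\colon L \to \mathrm{End}(V_2)$$

be representations of $L$. A linear map $f\colon V_1 \to V_2$ is a homomorphism of representations if

$$\forall\, x \in L : f \circ \rho_1(x) = \rho_2(x) \circ f.$$

Equivalently, if the following diagram commutes for all $x \in L$.

$$\begin{array}{ccc} V_1 & \xrightarrow{\;\; f \;\;} & V_2 \\ \downarrow {\scriptstyle \rho_1(x)} & & \downarrow {\scriptstyle \rho_2(x)} \\ V_1 & \xrightarrow{\;\; f \;\;} & V_2 \end{array}$$

If in addition $f\colon V_1 \to V_2$ is a linear isomorphism, then $f^{-1}\colon V_2 \to V_1$ is automatically a homomorphism of representations, since

$$\begin{aligned} f \circ \rho_1(x) = \rho_2(x) \circ f \quad &\Leftrightarrow \quad f^{-1} \circ (f \circ \rho_1(x)) \circ f^{-1} = f^{-1} \circ (\rho_2(x) \circ f) \circ f^{-1} \\ &\Leftrightarrow \quad \rho_1(x) \circ f^{-1} = f^{-1} \circ \rho_2(x). \end{aligned}$$

Definition. An isomorphism of representations of Lie algebras is a bijective homomorphism of representations.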


Isomorphic representations necessarily have the same dimension. 


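Example 17.2. Consider $\mathfrak{so}(3,\mathbb{R})$, the Lie algebra of the rotation group $\mathrm{SO}(3,\mathbb{R})$. It is a 3-dimensional Lie algebra over $\mathbb{R}$. It has a basis $\{J_1, J_2, J_3\}$ satisfying

$$[J_i, J_j] = C^k{}_{ij} J_k,$$

where the structure constants $C^k{}_{ij}$ are defined by first "pulling the index $k$ down" using the Killing form $\kappa_{ab} = C^m{}_{an} C^n{}_{bm}$ to obtain $C_{kij} := \kappa_{km} C^m{}_{ij}$, and then setting

$$C_{kij} := \varepsilon_{ijk} := \begin{cases} 1 & \text{if } (ijk) \text{ is an even permutation of } (123) \\ -1 & \text{if } (ijk) \text{ is an odd permutation of } (123) \\ 0 & \text{otherwise.} \end{cases}$$

By evaluating these, we find

$$[J_1, J_2] = J_3, \qquad [J_2, J_3] = J_1, \qquad [J_3, J_1] = J_2.$$

Define a linear map $\rho_{\mathrm{vec}}\colon \mathfrak{so}(3,\mathbb{R}) \to \mathrm{End}(\mathbb{R}^3)$ by

$$\rho_{\mathrm{vec}}(J_1) := \begin{pmatrix} 0 & 0 & 0 \\ 0 & 0 & -1 \\ 0 & 1 & 0 \end{pmatrix}, \qquad \rho_{\mathrm{vec}}(J_2) := \begin{pmatrix} 0 & 0 & 1 \\ 0 & 0 & 0 \\ -1 & 0 & 0 \end{pmatrix}, \qquad \rho_{\mathrm{vec}}(J_3) := \begin{pmatrix} 0 & -1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix}.$$

You can easily check that this is a representation of $\mathfrak{so}(3,\mathbb{R})$. However, as you may be aware from quantum mechanics, there is another representation of $\mathfrak{so}(3,\mathbb{R})$, namely

$$\rho_{\mathrm{spin}}\colon \mathfrak{so}(3,\mathbb{R}) \to \mathrm{End}(\mathbb{C}^2),$$

with $\mathbb{C}^2$ understood as a 4-dimensional $\mathbb{R}$-vector space, defined by

$$\rho_{\mathrm{spin}}(J_1) := -\tfrac{i}{2} \sigma_1, \qquad \rho_{\mathrm{spin}}(J_2) := -\tfrac{i}{2} \sigma_2, \qquad \rho_{\mathrm{spin}}(J_3) := -\tfrac{i}{2} \sigma_3,$$

where $\sigma_1, \sigma_2, \sigma_3$ are the Pauli matrices

$$\sigma_1 = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \qquad \sigma_2 = \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix}, \qquad \sigma_3 = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}.$$

You can again check that this is a representation of $\mathfrak{so}(3,\mathbb{R})$. Since

$$\dim \mathbb{R}^3 = 3 \neq 4 = \dim \mathbb{C}^2,$$

the representations $\rho_{\mathrm{vec}}$ and $\rho_{\mathrm{spin}}$ are not isomorphic.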
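Both "you can check" claims in this example amount to three matrix commutators each. A minimal sketch follows, using Python's built-in complex numbers for the Pauli matrices. The helper names and the tolerance in `close` are our own choices.

```python
# Sketch: verify [J1, J2] = J3 and its cyclic permutations in rho_vec and rho_spin.

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))]
            for i in range(len(A))]

def commutator(A, B):
    AB, BA = matmul(A, B), matmul(B, A)
    return [[p - q for p, q in zip(ra, rb)] for ra, rb in zip(AB, BA)]

def scale(factor, A):
    return [[factor * entry for entry in row] for row in A]

def close(A, B, tol=1e-12):
    """Entry-wise comparison up to a small tolerance."""
    return all(abs(p - q) <= tol for ra, rb in zip(A, B) for p, q in zip(ra, rb))

rho_vec = {
    1: [[0, 0, 0], [0, 0, -1], [0, 1, 0]],
    2: [[0, 0, 1], [0, 0, 0], [-1, 0, 0]],
    3: [[0, -1, 0], [1, 0, 0], [0, 0, 0]],
}

sigma = {
    1: [[0, 1], [1, 0]],
    2: [[0, -1j], [1j, 0]],
    3: [[1, 0], [0, -1]],
}
rho_spin = {i: scale(-0.5j, sigma[i]) for i in sigma}   # rho_spin(J_i) = -(i/2) sigma_i

cyclic = [(1, 2, 3), (2, 3, 1), (3, 1, 2)]              # [J_i, J_j] = J_k

vec_ok = all(close(commutator(rho_vec[i], rho_vec[j]), rho_vec[k]) for i, j, k in cyclic)
spin_ok = all(close(commutator(rho_spin[i], rho_spin[j]), rho_spin[k]) for i, j, k in cyclic)
```

The factor $-\tfrac{i}{2}$ is what turns the Pauli relation $[\sigma_1, \sigma_2] = 2i\sigma_3$ into $[J_1, J_2] = J_3$.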


Any (non-abelian) Lie algebra always has at least two special representations. 


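Definition. Let $L$ be a Lie algebra. A trivial representation of $L$ is defined by

$$\rho_{\mathrm{trv}}\colon L \to \mathrm{End}(V), \qquad x \mapsto \rho_{\mathrm{trv}}(x) := 0,$$

where $0$ denotes the trivial endomorphism on $V$.

Definition. The adjoint representation of $L$ is

$$\rho_{\mathrm{adj}}\colon L \to \mathrm{End}(L), \qquad x \mapsto \rho_{\mathrm{adj}}(x) := \mathrm{ad}(x).$$

These are indeed representations since we have already shown that $\mathrm{ad}$ is a Lie algebra homomorphism, while for the trivial representations we have

$$\forall\, x, y \in L : \rho_{\mathrm{trv}}([x, y]) = 0 = [\rho_{\mathrm{trv}}(x), \rho_{\mathrm{trv}}(y)].$$

Definition. A representation $\rho\colon L \to \mathrm{End}(V)$ is called faithful if $\rho$ is injective, i.e.

$$\dim(\mathrm{im}_\rho(L)) = \dim L.$$

Example 17.3. All representations considered so far are faithful, except for the trivial representations whenever the Lie algebra $L$ is not itself trivial. Consider, for instance, the adjoint representation. We have

$$\begin{aligned} \mathrm{ad}(x) = \mathrm{ad}(y) \quad &\Leftrightarrow \quad \forall\, z \in L : \mathrm{ad}(x) z = \mathrm{ad}(y) z \\ &\Leftrightarrow \quad \forall\, z \in L : [x, z] = [y, z] \\ &\Leftrightarrow \quad \forall\, z \in L : [x - y, z] = 0. \end{aligned}$$

Hence $\mathrm{ad}(x) = \mathrm{ad}(y)$ if, and only if, $x - y$ lies in the centre of $L$, so $\mathrm{ad}$ is injective precisely when the centre of $L$ is trivial. This is the case for every semi-simple Lie algebra, since the centre is an abelian ideal; in particular, it holds for $\mathfrak{sl}(2,\mathbb{C})$ and $\mathfrak{so}(3,\mathbb{R})$, for which we then have $x - y = 0$, so $x = y$.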


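Definition. Given two representations $\rho_1\colon L \to \mathrm{End}(V_1)$, $\rho_2\colon L \to \mathrm{End}(V_2)$, we can construct new representations called

i) the direct sum representation

$$\rho_1 \oplus \rho_2\colon L \to \mathrm{End}(V_1 \oplus V_2), \qquad x \mapsto (\rho_1 \oplus \rho_2)(x) := \rho_1(x) \oplus \rho_2(x);$$

ii) the tensor product representation

$$\rho_1 \otimes \rho_2\colon L \to \mathrm{End}(V_1 \otimes V_2), \qquad x \mapsto (\rho_1 \otimes \rho_2)(x) := \rho_1(x) \otimes \mathrm{id}_{V_2} + \mathrm{id}_{V_1} \otimes \rho_2(x).$$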
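To see why the tensor product representation is a sum of two terms rather than the naive $\rho_1(x) \otimes \rho_2(x)$, one can try both on the 2-dimensional representation of sl(2,C) with $\rho(X_1) = \mathrm{diag}(1,-1)$. The sketch below realises $\otimes$ as the Kronecker product of matrices. All helper names are our own.

```python
# Sketch: the tensor product of the 2-dimensional sl(2,C) representation with itself.

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))]
            for i in range(len(A))]

def add(A, B, sign=1):
    return [[p + sign * q for p, q in zip(ra, rb)] for ra, rb in zip(A, B)]

def commutator(A, B):
    return add(matmul(A, B), matmul(B, A), sign=-1)

def scale(factor, A):
    return [[factor * entry for entry in row] for row in A]

def kron(A, B):
    """Kronecker product: the matrix of A ⊗ B in the product basis."""
    return [[a * b for a in row_a for b in row_b] for row_a in A for row_b in B]

def identity(n):
    return [[1 if i == j else 0 for j in range(n)] for i in range(n)]

def tensor_rep(r1, r2):
    """(rho1 ⊗ rho2)(x) = rho1(x) ⊗ id + id ⊗ rho2(x)."""
    return add(kron(r1, identity(len(r2))), kron(identity(len(r1)), r2))

X1, X2, X3 = [[1, 0], [0, -1]], [[0, 1], [0, 0]], [[0, 0], [1, 0]]
T1, T2, T3 = (tensor_rep(X, X) for X in (X1, X2, X3))

# the sum form respects the brackets [X1,X2] = 2 X2, [X1,X3] = -2 X3, [X2,X3] = X1
sum_form_ok = (commutator(T1, T2) == scale(2, T2)
               and commutator(T1, T3) == scale(-2, T3)
               and commutator(T2, T3) == T1)

# the naive product form x -> rho(x) ⊗ rho(x) does not
naive_ok = commutator(kron(X1, X1), kron(X2, X2)) == scale(2, kron(X2, X2))
```

The naive form is not even linear in $x$, so it cannot define a Lie algebra representation; the sum form is what one obtains by differentiating the tensor product of group representations at the identity.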


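Example 17.4. The direct sum representation $\rho_{\mathrm{vec}} \oplus \rho_{\mathrm{spin}}\colon \mathfrak{so}(3,\mathbb{R}) \to \mathrm{End}(\mathbb{R}^3 \oplus \mathbb{C}^2)$ given in block-matrix form by

$$(\rho_{\mathrm{vec}} \oplus \rho_{\mathrm{spin}})(x) = \begin{pmatrix} \rho_{\mathrm{vec}}(x) & 0 \\ 0 & \rho_{\mathrm{spin}}(x) \end{pmatrix}$$

is a 7-dimensional representation of $\mathfrak{so}(3,\mathbb{R})$.

Definition. A representation $\rho\colon L \to \mathrm{End}(V)$ is called reducible if there exists a non-trivial vector subspace $U \subseteq V$ which is invariant under the action of $\rho$, i.e.

$$\forall\, x \in L : \forall\, u \in U : \rho(x) u \in U.$$

In other words, $\rho$ restricts to a representation $\rho|_U\colon L \to \mathrm{End}(U)$.

Definition. A representation is irreducible if it is not reducible.

Example 17.5. i) The representation $\rho_{\mathrm{vec}} \oplus \rho_{\mathrm{spin}}\colon \mathfrak{so}(3,\mathbb{R}) \to \mathrm{End}(\mathbb{R}^3 \oplus \mathbb{C}^2)$ is reducible since, for example, we have a subspace $\mathbb{R}^3 \oplus 0$ such that

$$\forall\, x \in \mathfrak{so}(3,\mathbb{R}) : \forall\, u \in \mathbb{R}^3 \oplus 0 : (\rho_{\mathrm{vec}} \oplus \rho_{\mathrm{spin}})(x) u \in \mathbb{R}^3 \oplus 0.$$

ii) The representations $\rho_{\mathrm{vec}}$ and $\rho_{\mathrm{spin}}$ are both irreducible.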


Remark 17.6. Just like the simple Lie algebras are the building blocks of all semi-simple Lie
algebras, the irreducible representations of a semi-simple Lie algebra are the building blocks
of all finite-dimensional representations of the Lie algebra. Any such representation can be
decomposed as the direct sum of irreducible representations, which can then be classified
according to their so-called highest weights.


17.2 The Casimir operator


To every representation $\rho$ of a compact Lie algebra (i.e. the Lie algebra of a compact Lie group) there is associated an operator $\Omega_\rho$, called the Casimir operator. We will need some preparation in order to define it.

Definition. Let $\rho\colon L \to \mathrm{End}(V)$ be a representation of a complex Lie algebra $L$. We define the $\rho$-Killing form on $L$ as

$$\kappa_\rho\colon L \times L \to \mathbb{C}, \qquad (x, y) \mapsto \kappa_\rho(x, y) := \mathrm{tr}(\rho(x) \circ \rho(y)).$$

Of course, the Killing form we have considered so far is just $\kappa_{\mathrm{ad}}$. Similarly to $\kappa_{\mathrm{ad}}$, every $\kappa_\rho$ is symmetric and associative with respect to the Lie bracket of $L$.


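Proposition 17.7. Let $\rho\colon L \to \mathrm{End}(V)$ be a faithful representation of a complex semi-simple Lie algebra $L$. Then, $\kappa_\rho$ is non-degenerate.

Hence, $\kappa_\rho$ induces an isomorphism $L \to L^*$ via

$$L \ni X \mapsto \kappa_\rho(X, -) \in L^*.$$

Recall that if $\{X_1, \ldots, X_{\dim L}\}$ is a basis of $L$, then the dual basis $\{\widetilde{X}^1, \ldots, \widetilde{X}^{\dim L}\}$ of $L^*$ is defined by

$$\widetilde{X}^i(X_j) = \delta^i_j.$$

By using the isomorphism induced by $\kappa_\rho$, we can find some $\xi_1, \ldots, \xi_{\dim L} \in L$ such that we have $\kappa_\rho(\xi_i, -) = \widetilde{X}^i$ or, equivalently,

$$\forall\, x \in L : \kappa_\rho(x, \xi_i) = \widetilde{X}^i(x).$$

We thus have

$$\kappa_\rho(X_i, \xi_j) = \delta_{ij}.$$

Proposition 17.8. Let $\{X_i\}$ and $\{\xi_j\}$ be defined as above. Then

$$[X_j, \xi_k] = \sum_{m=1}^{\dim L} C^k{}_{mj}\, \xi_m,$$

where $C^k{}_{mj}$ are the structure constants with respect to $\{X_i\}$.

Proof. By using the associativity of $\kappa_\rho$, we have

$$\kappa_\rho(X_i, [X_j, \xi_k]) = \kappa_\rho([X_i, X_j], \xi_k) = C^m{}_{ij}\, \kappa_\rho(X_m, \xi_k) = C^m{}_{ij}\, \delta_{mk} = C^k{}_{ij}.$$

But we also have

$$\kappa_\rho\Big( X_i, \sum_{m=1}^{\dim L} C^k{}_{mj}\, \xi_m \Big) = \sum_{m=1}^{\dim L} C^k{}_{mj}\, \kappa_\rho(X_i, \xi_m) = \sum_{m=1}^{\dim L} C^k{}_{mj}\, \delta_{im} = C^k{}_{ij}.$$

Therefore

$$\forall\, 1 \leq i \leq \dim L : \kappa_\rho\Big( X_i, [X_j, \xi_k] - \sum_{m=1}^{\dim L} C^k{}_{mj}\, \xi_m \Big) = 0$$

and hence, the result follows from the non-degeneracy of $\kappa_\rho$.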


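We are now ready to define the Casimir operator and prove the subsequent theorem.

Definition. Let $\rho\colon L \to \mathrm{End}(V)$ be a faithful representation of a complex (compact) Lie algebra $L$ and let $\{X_1, \ldots, X_{\dim L}\}$ be a basis of $L$. The Casimir operator associated to the representation $\rho$ is the endomorphism $\Omega_\rho\colon V \to V$

$$\Omega_\rho := \sum_{i=1}^{\dim L} \rho(X_i) \circ \rho(\xi_i).$$

Theorem 17.9. Let $\Omega_\rho$ be the Casimir operator of a representation $\rho\colon L \to \mathrm{End}(V)$. Then

$$\forall\, x \in L : [\Omega_\rho, \rho(x)] = 0,$$

that is, $\Omega_\rho$ commutes with every endomorphism in $\mathrm{im}_\rho(L)$.

Proof. Note that the bracket above is that on $\mathrm{End}(V)$. Let $x = x^k X_k \in L$. Then

$$\begin{aligned} [\Omega_\rho, \rho(x)] &= \Big[ \sum_{i=1}^{\dim L} \rho(X_i) \circ \rho(\xi_i),\; \rho(x^k X_k) \Big] \\ &= \sum_{i,k=1}^{\dim L} x^k\, [\rho(X_i) \circ \rho(\xi_i), \rho(X_k)]. \end{aligned}$$

Observe that if the Lie bracket is the commutator with respect to an associative product, as is the case for $\mathrm{End}(V)$, we have

$$\begin{aligned} [AB, C] &= ABC - CAB \\ &= ABC - CAB - ACB + ACB \\ &= A[B, C] + [A, C]B. \end{aligned}$$

Hence, by applying this, we obtain

$$\begin{aligned} \sum_{i,k=1}^{\dim L} x^k\, [\rho(X_i) \circ \rho(\xi_i), \rho(X_k)] &= \sum_{i,k=1}^{\dim L} x^k \big( \rho(X_i) \circ [\rho(\xi_i), \rho(X_k)] + [\rho(X_i), \rho(X_k)] \circ \rho(\xi_i) \big) \\ &= \sum_{i,k=1}^{\dim L} x^k \big( \rho(X_i) \circ \rho([\xi_i, X_k]) + \rho([X_i, X_k]) \circ \rho(\xi_i) \big) \\ &= \sum_{i,k,m=1}^{\dim L} x^k \big( \rho(X_i) \circ \rho(-C^i{}_{mk}\, \xi_m) + \rho(C^m{}_{ik}\, X_m) \circ \rho(\xi_i) \big) \\ &= \sum_{i,k,m=1}^{\dim L} x^k \big( -C^i{}_{mk}\, \rho(X_i) \circ \rho(\xi_m) + C^m{}_{ik}\, \rho(X_m) \circ \rho(\xi_i) \big) \\ &= \sum_{i,k,m=1}^{\dim L} x^k \big( -C^i{}_{mk}\, \rho(X_i) \circ \rho(\xi_m) + C^i{}_{mk}\, \rho(X_i) \circ \rho(\xi_m) \big) \\ &= 0, \end{aligned}$$

where we have swapped the dummy summation indices $i$ and $m$ in the second term.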


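Lemma 17.10 (Schur). If $\rho\colon L \to \mathrm{End}(V)$ is irreducible, then any operator $S$ which commutes with every endomorphism in $\mathrm{im}_\rho(L)$ has the form

$$S = c_\rho\, \mathrm{id}_V$$

for some constant $c_\rho \in \mathbb{C}$ (or $\mathbb{R}$, if $L$ is a real Lie algebra).

It follows immediately that $\Omega_\rho = c_\rho\, \mathrm{id}_V$ for some $c_\rho$ but, in fact, we can say more.

Proposition 17.11. The Casimir operator of an irreducible representation $\rho\colon L \to \mathrm{End}(V)$ is $\Omega_\rho = c_\rho\, \mathrm{id}_V$, where

$$c_\rho = \frac{\dim L}{\dim V}.$$

Proof. We have

$$\mathrm{tr}(\Omega_\rho) = \mathrm{tr}(c_\rho\, \mathrm{id}_V) = c_\rho \dim V$$

and

$$\begin{aligned} \mathrm{tr}(\Omega_\rho) &= \mathrm{tr}\Big( \sum_{i=1}^{\dim L} \rho(X_i) \circ \rho(\xi_i) \Big) \\ &= \sum_{i=1}^{\dim L} \mathrm{tr}(\rho(X_i) \circ \rho(\xi_i)) \\ &= \sum_{i=1}^{\dim L} \kappa_\rho(X_i, \xi_i) \\ &= \sum_{i=1}^{\dim L} \delta_{ii} \\ &= \dim L, \end{aligned}$$

which is what we wanted.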


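Example 17.12. Consider the Lie algebra $\mathfrak{so}(3,\mathbb{R})$ with basis $\{J_1, J_2, J_3\}$ satisfying

$$[J_i, J_j] = \varepsilon_{ijk} J_k,$$

where we assume the summation convention on the lower index $k$. Recall that the representation $\rho_{\mathrm{vec}}\colon \mathfrak{so}(3,\mathbb{R}) \to \mathrm{End}(\mathbb{R}^3)$ is defined by

$$\rho_{\mathrm{vec}}(J_1) := \begin{pmatrix} 0 & 0 & 0 \\ 0 & 0 & -1 \\ 0 & 1 & 0 \end{pmatrix}, \qquad \rho_{\mathrm{vec}}(J_2) := \begin{pmatrix} 0 & 0 & 1 \\ 0 & 0 & 0 \\ -1 & 0 & 0 \end{pmatrix}, \qquad \rho_{\mathrm{vec}}(J_3) := \begin{pmatrix} 0 & -1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix}.$$

Let us first evaluate the components of $\kappa_{\rho_{\mathrm{vec}}}$. We have

$$\begin{aligned} (\kappa_{\rho_{\mathrm{vec}}})_{11} := \kappa_{\rho_{\mathrm{vec}}}(J_1, J_1) &= \mathrm{tr}(\rho_{\mathrm{vec}}(J_1) \circ \rho_{\mathrm{vec}}(J_1)) \\ &= \mathrm{tr}\big( (\rho_{\mathrm{vec}}(J_1))^2 \big) \\ &= \mathrm{tr} \begin{pmatrix} 0 & 0 & 0 \\ 0 & 0 & -1 \\ 0 & 1 & 0 \end{pmatrix}^2 \\ &= \mathrm{tr} \begin{pmatrix} 0 & 0 & 0 \\ 0 & -1 & 0 \\ 0 & 0 & -1 \end{pmatrix} \\ &= -2. \end{aligned}$$

After calculating the other components similarly, we find

$$[(\kappa_{\rho_{\mathrm{vec}}})_{ij}] = \begin{pmatrix} -2 & 0 & 0 \\ 0 & -2 & 0 \\ 0 & 0 & -2 \end{pmatrix}.$$

Thus, $\kappa_{\rho_{\mathrm{vec}}}(J_i, \xi_j) = \delta_{ij}$ requires that we define $\xi_i := -\frac{1}{2} J_i$. Then, we have

$$\begin{aligned} \Omega_{\rho_{\mathrm{vec}}} &= \sum_{i=1}^{3} \rho_{\mathrm{vec}}(J_i) \circ \rho_{\mathrm{vec}}(\xi_i) \\ &= \sum_{i=1}^{3} \rho_{\mathrm{vec}}(J_i) \circ \rho_{\mathrm{vec}}\big( -\tfrac{1}{2} J_i \big) \\ &= -\frac{1}{2} \sum_{i=1}^{3} (\rho_{\mathrm{vec}}(J_i))^2 \\ &= -\frac{1}{2} \left( \begin{pmatrix} 0 & 0 & 0 \\ 0 & 0 & -1 \\ 0 & 1 & 0 \end{pmatrix}^2 + \begin{pmatrix} 0 & 0 & 1 \\ 0 & 0 & 0 \\ -1 & 0 & 0 \end{pmatrix}^2 + \begin{pmatrix} 0 & -1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix}^2 \right) \\ &= -\frac{1}{2} \left( \begin{pmatrix} 0 & 0 & 0 \\ 0 & -1 & 0 \\ 0 & 0 & -1 \end{pmatrix} + \begin{pmatrix} -1 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & -1 \end{pmatrix} + \begin{pmatrix} -1 & 0 & 0 \\ 0 & -1 & 0 \\ 0 & 0 & 0 \end{pmatrix} \right) \\ &= \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}. \end{aligned}$$

Hence $\Omega_{\rho_{\mathrm{vec}}} = c_{\rho_{\mathrm{vec}}}\, \mathrm{id}_{\mathbb{R}^3}$ with $c_{\rho_{\mathrm{vec}}} = 1$, which agrees with our previous proposition since

$$\frac{\dim \mathfrak{so}(3,\mathbb{R})}{\dim \mathbb{R}^3} = \frac{3}{3} = 1.$$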
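The whole computation of this example (the $\rho$-Killing form, the dual elements $\xi_i$, and the Casimir operator) can be reproduced with exact rational arithmetic. The sketch below relies on the fact, checked by an assertion in the code, that $\kappa_{\rho_{\mathrm{vec}}}$ is diagonal in the basis $\{J_i\}$; for a non-diagonal form one would have to invert the full matrix instead. Helper names are our own.

```python
# Sketch: Casimir operator of rho_vec for so(3,R), computed exactly.
from fractions import Fraction

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))]
            for i in range(len(A))]

def trace(A):
    return sum(A[i][i] for i in range(len(A)))

rho_vec = [
    [[0, 0, 0], [0, 0, -1], [0, 1, 0]],    # rho_vec(J1)
    [[0, 0, 1], [0, 0, 0], [-1, 0, 0]],    # rho_vec(J2)
    [[0, -1, 0], [1, 0, 0], [0, 0, 0]],    # rho_vec(J3)
]

# rho-Killing form: kappa_ij = tr(rho(J_i) rho(J_j))
kappa = [[trace(matmul(A, B)) for B in rho_vec] for A in rho_vec]

# Assumption used below: kappa is diagonal, so xi_i = J_i / kappa_ii.
assert all(kappa[i][j] == 0 for i in range(3) for j in range(3) if i != j)

rho_xi = [[[Fraction(entry, kappa[i][i]) for entry in row] for row in rho_vec[i]]
          for i in range(3)]

# Omega = sum_i rho(J_i) rho(xi_i)
casimir = [[Fraction(0)] * 3 for _ in range(3)]
for i in range(3):
    term = matmul(rho_vec[i], rho_xi[i])
    casimir = [[casimir[r][c] + term[r][c] for c in range(3)] for r in range(3)]

c_rho = trace(casimir) / 3        # tr(Omega) / dim V
```

Here `c_rho` comes out as $\dim L / \dim V = 3/3 = 1$, matching the value found by hand.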


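Example 17.13. Let us consider the Lie algebra $\mathfrak{so}(3,\mathbb{R})$ again, but this time with representation $\rho_{\mathrm{spin}}$. Recall that this is given by

$$\rho_{\mathrm{spin}}(J_1) := -\tfrac{i}{2} \sigma_1, \qquad \rho_{\mathrm{spin}}(J_2) := -\tfrac{i}{2} \sigma_2, \qquad \rho_{\mathrm{spin}}(J_3) := -\tfrac{i}{2} \sigma_3,$$

where $\sigma_1, \sigma_2, \sigma_3$ are the Pauli matrices. Recalling that $\sigma_1^2 = \sigma_2^2 = \sigma_3^2 = \mathrm{id}_{\mathbb{C}^2}$, we calculate

$$\begin{aligned} (\kappa_{\rho_{\mathrm{spin}}})_{11} := \kappa_{\rho_{\mathrm{spin}}}(J_1, J_1) &= \mathrm{tr}(\rho_{\mathrm{spin}}(J_1) \circ \rho_{\mathrm{spin}}(J_1)) \\ &= \mathrm{tr}\big( (\rho_{\mathrm{spin}}(J_1))^2 \big) \\ &= \big( -\tfrac{i}{2} \big)^2\, \mathrm{tr}(\sigma_1^2) \\ &= -\tfrac{1}{4}\, \mathrm{tr}(\mathrm{id}_{\mathbb{C}^2}) \\ &= -1. \end{aligned}$$

Note that $\mathrm{tr}(\mathrm{id}_{\mathbb{C}^2}) = 4$, since $\mathrm{tr}(\mathrm{id}_V) = \dim V$ and here $\mathbb{C}^2$ is considered as a 4-dimensional vector space over $\mathbb{R}$. Proceeding similarly, we find that the components of $\kappa_{\rho_{\mathrm{spin}}}$ are

$$[(\kappa_{\rho_{\mathrm{spin}}})_{ij}] = \begin{pmatrix} -1 & 0 & 0 \\ 0 & -1 & 0 \\ 0 & 0 & -1 \end{pmatrix}.$$

Hence, we define $\xi_i := -J_i$. Then, we have

$$\begin{aligned} \Omega_{\rho_{\mathrm{spin}}} &= \sum_{i=1}^{3} \rho_{\mathrm{spin}}(J_i) \circ \rho_{\mathrm{spin}}(\xi_i) \\ &= \sum_{i=1}^{3} \rho_{\mathrm{spin}}(J_i) \circ \rho_{\mathrm{spin}}(-J_i) \\ &= -\sum_{i=1}^{3} (\rho_{\mathrm{spin}}(J_i))^2 \\ &= \frac{1}{4} \sum_{i=1}^{3} \sigma_i^2 \\ &= \frac{3}{4}\, \mathrm{id}_{\mathbb{C}^2}, \end{aligned}$$

in accordance with the fact that

$$\frac{\dim \mathfrak{so}(3,\mathbb{R})}{\dim \mathbb{C}^2} = \frac{3}{4}.$$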

17.3 Representations of Lie groups


We now turn to representations of Lie groups. Given a vector space $V$, recall that the subset of $\mathrm{End}(V)$ consisting of the invertible endomorphisms and denoted

$$\mathrm{GL}(V) \equiv \mathrm{Aut}(V) := \{ \phi \in \mathrm{End}(V) \mid \det \phi \neq 0 \},$$

forms a group under composition, called the automorphism group (or general linear group) of $V$. Moreover, if $V$ is a finite-dimensional $K$-vector space, then $V \cong_{\mathrm{vec}} K^{\dim V}$ and hence the group $\mathrm{GL}(V)$ can be given the structure of a Lie group via

$$\mathrm{GL}(V) \cong_{\text{Lie grp}} \mathrm{GL}(K^{\dim V}) =: \mathrm{GL}(\dim V, K).$$

This is, of course, if we have established a topology and a differentiable structure on $K^d$, as is the case for $\mathbb{R}^d$ and $\mathbb{C}^d$.


Definition. A representation of a Lie group $(G, \bullet)$ is a Lie group homomorphism

\[
R \colon G \to \mathrm{GL}(V)
\]

for some finite-dimensional vector space $V$.

Recall that $R \colon G \to \mathrm{GL}(V)$ is a Lie group homomorphism if it is smooth and

\[
\forall\, g_1, g_2 \in G : \quad R(g_1 \bullet g_2) = R(g_1) \circ R(g_2).
\]

Note that, as is the case with any group homomorphism, we have

\[
R(e) = \mathrm{id}_V \qquad \text{and} \qquad R(g^{-1}) = R(g)^{-1}.
\]

Example 17.14. Consider the Lie group $\mathrm{SO}(2,\mathbb{R})$. As a smooth manifold, $\mathrm{SO}(2,\mathbb{R})$ is isomorphic to the circle $S^1$. Let $U = S^1 \setminus \{p_0\}$, where $p_0$ is any point of $S^1$, so that we can define a chart $\theta \colon U \to [0, 2\pi) \subseteq \mathbb{R}$ on $S^1$ by mapping each point in $U$ to an “angle” in $[0, 2\pi)$.

The operation

\[
p_1 \bullet p_2 := (\theta(p_1) + \theta(p_2)) \bmod 2\pi
\]

endows $S^1 \cong_{\mathrm{diff}} \mathrm{SO}(2,\mathbb{R})$ with the structure of a Lie group. Then, a representation of $\mathrm{SO}(2,\mathbb{R})$ is given by

\[
R \colon \mathrm{SO}(2,\mathbb{R}) \to \mathrm{GL}(\mathbb{R}^2), \qquad
p \mapsto \begin{pmatrix} \cos\theta(p) & \sin\theta(p) \\ -\sin\theta(p) & \cos\theta(p) \end{pmatrix}.
\]

Indeed, the addition formulae for sine and cosine imply that

\[
R(p_1 \bullet p_2) = R(p_1) \circ R(p_2).
\]
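The homomorphism property of Example 17.14 can also be checked numerically. The following is a minimal sketch, not part of the original notes: points of $S^1$ are represented directly by their angles, and the helper names (`rotation`, `compose_angles`, `matmul`, `close`) are our own illustrative choices.

```python
import math

def rotation(theta):
    """Matrix R(p) for the point p of S^1 whose angle is theta."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, s], [-s, c]]

def compose_angles(t1, t2):
    """Group operation on S^1 in the angle chart: addition mod 2*pi."""
    return (t1 + t2) % (2 * math.pi)

def matmul(a, b):
    """Product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def close(a, b, tol=1e-12):
    """Entry-wise comparison up to floating-point round-off."""
    return all(abs(x - y) <= tol
               for row_a, row_b in zip(a, b)
               for x, y in zip(row_a, row_b))

# Two angles whose sum exceeds 2*pi, so that the "mod" really matters.
t1, t2 = 2.5, 5.0
lhs = rotation(compose_angles(t1, t2))        # R(p1 . p2)
rhs = matmul(rotation(t1), rotation(t2))      # R(p1) o R(p2)
homomorphism_holds = close(lhs, rhs)
```

The comparison uses a tolerance rather than exact equality because the trigonometric functions are evaluated in floating point.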


Example 17.15. Let $G$ be a Lie group (we suppress the $\bullet$ in this example). For each $g \in G$, define the Adjoint map

\[
\mathrm{Ad}_g \colon G \to G, \qquad h \mapsto g h g^{-1}.
\]

Note the capital “A” to distinguish this from the adjoint map on Lie algebras. Since $\mathrm{Ad}_g$ is a composition of the Lie group multiplication and inverse map, it is a smooth map. Moreover, we have

\[
\mathrm{Ad}_g(e) = g e g^{-1} = g g^{-1} = e.
\]

Hence, the push-forward of $\mathrm{Ad}_g$ at the identity is the map

\[
(\mathrm{Ad}_{g*})_e \colon T_eG \to T_{\mathrm{Ad}_g(e)}G = T_eG.
\]

Thus, we have $\mathrm{Ad}_{g*} \in \mathrm{End}(T_eG)$. In fact, you can check that

\[
(\mathrm{Ad}_{g^{-1}*})_e \circ (\mathrm{Ad}_{g*})_e = (\mathrm{Ad}_{g*})_e \circ (\mathrm{Ad}_{g^{-1}*})_e = \mathrm{id}_{T_eG},
\]

and hence we have, in particular, $\mathrm{Ad}_{g*} \in \mathrm{GL}(T_eG) \cong_{\mathrm{Lie\,grp}} \mathrm{GL}(\mathcal{L}(G))$.

We can therefore construct a map

\[
\mathrm{Ad} \colon G \to \mathrm{GL}(T_eG), \qquad g \mapsto \mathrm{Ad}_{g*},
\]

which, as you can check, is a representation of $G$ on its Lie algebra.


Remark 17.16. Since a representation $R$ of a Lie group $G$ is required to be smooth, we can always consider its differential or push-forward at the identity

\[
(R_*)_e \colon T_eG \to T_{\mathrm{id}_V}\mathrm{GL}(V).
\]

Since for any $A, B \in T_eG$ we have

\[
(R_*)_e[A, B] = [(R_*)_e A, (R_*)_e B],
\]

the map $(R_*)_e$ is a representation of the Lie algebra of $G$ on the vector space $V$, where we identify $T_{\mathrm{id}_V}\mathrm{GL}(V)$ with $\mathrm{End}(V)$. In fact, in the previous example we have

\[
(\mathrm{Ad}_*)_e = \mathrm{ad},
\]

where $\mathrm{ad}$ is the adjoint representation of $T_eG$.

18 Reconstruction of a Lie group from its algebra


We have seen in detail how to construct a Lie algebra from a given Lie group. We would 
now like to consider the inverse question, i.e. whether, given a Lie algebra, we can construct 
a Lie group whose associated Lie algebra is the given one and, if this is the case, whether 
this correspondence is bijective. 


18.1 Integral curves 


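Definition. Let $M$ be a smooth manifold and let $Y \in \Gamma(TM)$. An integral curve of $Y$ is a smooth curve $\gamma \colon (-\varepsilon, \varepsilon) \to M$, with $\varepsilon > 0$, such that

\[
\forall\, \lambda \in (-\varepsilon, \varepsilon) : \quad X_{\gamma, \gamma(\lambda)} = Y|_{\gamma(\lambda)},
\]

where $X_{\gamma, \gamma(\lambda)}$ denotes the tangent vector to $\gamma$ at the point $\gamma(\lambda)$.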


It follows from the local existence and uniqueness of solutions to ordinary differential equations that, given any $Y \in \Gamma(TM)$ and any $p \in M$, there exist $\varepsilon > 0$ and a smooth curve $\gamma \colon (-\varepsilon, \varepsilon) \to M$ with $\gamma(0) = p$ which is an integral curve of $Y$.

Moreover, integral curves are locally unique. By this we mean that if $\gamma_1$ and $\gamma_2$ are both integral curves of $Y$ through $p$, i.e. $\gamma_1(0) = \gamma_2(0) = p$, then $\gamma_1 = \gamma_2$ on the intersection of their domains of definition. We can get genuine uniqueness as follows.


Definition. The maximal integral curve of $Y \in \Gamma(TM)$ through $p \in M$ is the unique integral curve $\gamma \colon I^p_{\max} \to M$ of $Y$ through $p$, where

\[
I^p_{\max} := \bigcup \{ I \subseteq \mathbb{R} \mid \text{there exists an integral curve } \gamma \colon I \to M \text{ of } Y \text{ through } p \}.
\]

For a given vector field, in general, $I^p_{\max}$ will differ from point to point.

Definition. A vector field is complete if $I^p_{\max} = \mathbb{R}$ for all $p \in M$.
We have the following result. 
Theorem 18.1. On a compact manifold, every vector field is complete. 
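To see that completeness can genuinely fail on a non-compact manifold, here is a short worked example. It is our own illustration rather than part of the original notes: the manifold $M = \mathbb{R}$ and the vector field $Y = x^2\,\partial/\partial x$ are chosen only because the integral curves can be written in closed form.

```latex
% Illustrative example (assumed, not from the notes): Y = x^2 d/dx on M = R.
\[
  \dot\gamma(\lambda) = \gamma(\lambda)^2, \quad \gamma(0) = p
  \qquad\Longrightarrow\qquad
  \gamma(\lambda) = \frac{p}{1 - p\lambda},
\]
% Check: d/d(lambda) [ p / (1 - p lambda) ] = p^2 / (1 - p lambda)^2 = gamma^2.
\[
  I^p_{\max} =
  \begin{cases}
    (-\infty,\, 1/p) & \text{if } p > 0, \\
    \mathbb{R}       & \text{if } p = 0, \\
    (1/p,\, \infty)  & \text{if } p < 0.
  \end{cases}
\]
```

Since $I^p_{\max} \neq \mathbb{R}$ for every $p \neq 0$, this vector field is not complete, and the maximal interval visibly differs from point to point. This does not contradict Theorem 18.1, because $\mathbb{R}$ is not compact.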
On a Lie group, even if non-compact, there are always complete vector fields. 


Theorem 18.2. Every left-invariant vector field on a Lie group is complete. 


The maximal integral curves of left-invariant vector fields are crucial in the construction 
of the map that allows us to go from a Lie algebra to a Lie group. 


18.2 The exponential map 


Let $G$ be a Lie group. Recall that given any $A \in T_eG$, we can define the uniquely determined left-invariant vector field $X^A := j(A)$ via the isomorphism $j \colon T_eG \to \mathcal{L}(G)$ as

\[
X^A|_g := (\ell_g)_*(A).
\]

Then let $\gamma^A \colon \mathbb{R} \to G$ be the maximal integral curve of $X^A$ through $e \in G$.

Definition. Let $G$ be a Lie group. The exponential map is defined as

\[
\exp \colon T_eG \to G, \qquad A \mapsto \exp(A) := \gamma^A(1).
\]


Theorem 18.3. i) The map $\exp$ is smooth and a local diffeomorphism around $0 \in T_eG$, i.e. there exists an open $V \subseteq T_eG$ containing $0$ such that the restriction

\[
\exp|_V \colon V \to \exp(V) \subseteq G
\]

is bijective and both $\exp|_V$ and $(\exp|_V)^{-1}$ are smooth.

ii) If $G$ is compact and connected, then $\exp$ is surjective.

Note that the maximal integral curve of $X^0$ is the constant curve $\gamma^0(\lambda) \equiv e$, and hence we have $\exp(0) = e$. The first part of the theorem then says that we can recover a neighbourhood of the identity of $G$ from a neighbourhood of the zero vector of $T_eG$.

Since $T_eG$ is a vector space, it is non-compact (intuitively, it extends infinitely far away in every direction) and hence, if $G$ is compact and connected, $\exp$ cannot be injective. This is because, by the second part of the theorem, it would then be a diffeomorphism $T_eG \to G$. But as $G$ is compact and $T_eG$ is not, they are not diffeomorphic.


Proposition 18.4. Let $G$ be a Lie group. The image of $\exp \colon T_eG \to G$ is the connected component of $G$ containing the identity.


Therefore, if G itself is connected, then exp is again surjective. Note that, in general, 
there is no relation between connected and compact topological spaces, i.e. a topological 
space can be either, both, or neither. 


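Example 18.5. Let $B \colon V \times V \to K$ be a pseudo inner product on a $K$-vector space $V$. Then

\[
\mathrm{O}(V) := \{\phi \in \mathrm{GL}(V) \mid \forall\, v, w \in V : B(\phi(v), \phi(w)) = B(v, w)\}
\]

is called the orthogonal group of $V$ with respect to $B$. Of course, if $B$ or the base field of $V$ need to be emphasised, they can be included in the notation. Every $\phi \in \mathrm{O}(V)$ has determinant $1$ or $-1$. Since $\det$ is multiplicative, we have a subgroup

\[
\mathrm{SO}(V) := \{\phi \in \mathrm{O}(V) \mid \det \phi = 1\}.
\]

These are, in fact, Lie subgroups of $\mathrm{GL}(V)$. The Lie group $\mathrm{SO}(V)$ is connected while

\[
\mathrm{O}(V) = \mathrm{SO}(V) \cup \{\phi \in \mathrm{O}(V) \mid \det \phi = -1\}
\]

is disconnected. Since $\mathrm{SO}(V)$ contains $\mathrm{id}_V$, we have

\[
\mathfrak{so}(V) := T_{\mathrm{id}_V}\mathrm{SO}(V) = T_{\mathrm{id}_V}\mathrm{O}(V) =: \mathfrak{o}(V)
\]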


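and

\[
\exp(\mathfrak{so}(V)) = \exp(\mathfrak{o}(V)) = \mathrm{SO}(V).
\]

Example 18.6. Choosing a basis $A_1, \ldots, A_{\dim G}$ of $T_eG$ often provides a convenient co-ordinatisation of $G$ near $e$. Consider, for example, the Lorentz group

\[
\mathrm{O}(3,1) \equiv \mathrm{O}(\mathbb{R}^4) = \{\Lambda \in \mathrm{GL}(\mathbb{R}^4) \mid \forall\, x, y \in \mathbb{R}^4 : B(\Lambda(x), \Lambda(y)) = B(x, y)\},
\]

where $B(x, y) := \eta_{\mu\nu} x^\mu y^\nu$, with $0 \le \mu, \nu \le 3$ and

\[
[\eta_{\mu\nu}] = [\eta^{\mu\nu}] := \begin{pmatrix} -1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}.
\]

The Lorentz group $\mathrm{O}(3,1)$ is 6-dimensional, hence so is the Lorentz algebra $\mathfrak{o}(3,1)$. For convenience, instead of denoting a basis of $\mathfrak{o}(3,1)$ as $\{M^i \mid i = 1, \ldots, 6\}$, we will denote it as $\{M^{\mu\nu} \mid 0 \le \mu, \nu \le 3\}$ and require that the indices $\mu, \nu$ be anti-symmetric, i.e.

\[
M^{\mu\nu} = -M^{\nu\mu}.
\]

Then $M^{\mu\nu} = 0$ when $\mu = \nu$, and the set $\{M^{\mu\nu} \mid 0 \le \mu, \nu \le 3\}$, while technically not linearly independent, contains the 6 independent elements that we want to consider as a basis. These basis elements satisfy the following bracket relation

\[
[M^{\mu\nu}, M^{\rho\sigma}] = \eta^{\nu\sigma} M^{\mu\rho} + \eta^{\mu\rho} M^{\nu\sigma} - \eta^{\nu\rho} M^{\mu\sigma} - \eta^{\mu\sigma} M^{\nu\rho}.
\]

Any element $\lambda \in \mathfrak{o}(3,1)$ can be expressed as a linear combination of the $M^{\mu\nu}$,

\[
\lambda = \tfrac{1}{2}\, \omega_{\mu\nu} M^{\mu\nu},
\]

where the indices on the coefficients $\omega_{\mu\nu}$ are also anti-symmetric, and the factor of $\tfrac{1}{2}$ ensures that the sum over all $\mu, \nu$ counts each anti-symmetric pair only once. Then, we have

\[
\Lambda = \exp(\lambda) = \exp\bigl(\tfrac{1}{2}\, \omega_{\mu\nu} M^{\mu\nu}\bigr) \in \mathrm{O}(3,1).
\]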


The subgroup of $\mathrm{O}(3,1)$ consisting of the space-orientation preserving Lorentz transformations, or proper Lorentz transformations, is denoted by $\mathrm{SO}(3,1)$. The subgroup consisting of the time-orientation preserving, or orthochronous, Lorentz transformations is denoted by $\mathrm{O}^+(3,1)$. The Lie group $\mathrm{O}(3,1)$ is disconnected: its four connected components are

i) $\mathrm{SO}^+(3,1) := \mathrm{SO}(3,1) \cap \mathrm{O}^+(3,1)$, also called the restricted Lorentz group, consisting of the proper orthochronous Lorentz transformations;

ii) $\mathrm{SO}(3,1) \setminus \mathrm{O}^+(3,1)$, the proper non-orthochronous transformations;

iii) $\mathrm{O}^+(3,1) \setminus \mathrm{SO}(3,1)$, the improper orthochronous transformations;

iv) $\mathrm{O}(3,1) \setminus (\mathrm{SO}(3,1) \cup \mathrm{O}^+(3,1))$, the improper non-orthochronous transformations.

Since $\mathrm{id}_{\mathbb{R}^4} \in \mathrm{SO}^+(3,1)$, we have $\exp(\mathfrak{o}(3,1)) = \mathrm{SO}^+(3,1)$. Then $\{M^{\mu\nu}\}$ provides a nice co-ordinatisation of $\mathrm{SO}^+(3,1)$ since, if we choose

\[
[\omega_{\mu\nu}] = \begin{pmatrix}
0 & \psi_1 & \psi_2 & \psi_3 \\
-\psi_1 & 0 & \varphi_3 & -\varphi_2 \\
-\psi_2 & -\varphi_3 & 0 & \varphi_1 \\
-\psi_3 & \varphi_2 & -\varphi_1 & 0
\end{pmatrix},
\]

then the Lorentz transformation $\exp(\tfrac{1}{2}\, \omega_{\mu\nu} M^{\mu\nu}) \in \mathrm{SO}^+(3,1)$ corresponds to a boost in the $(\psi_1, \psi_2, \psi_3)$ direction and a space rotation by $(\varphi_1, \varphi_2, \varphi_3)$. Indeed, in physics one often thinks of the Lie group $\mathrm{SO}^+(3,1)$ as being generated by $\{M^{\mu\nu}\}$.

A representation $\rho \colon T_{\mathrm{id}_{\mathbb{R}^4}}\mathrm{SO}^+(3,1) \to \mathrm{End}(\mathbb{R}^4)$ is given by

\[
\rho(M^{\mu\nu})^a{}_b := \eta^{\nu a} \delta^\mu_b - \eta^{\mu a} \delta^\nu_b,
\]

which is probably how you have seen the $M^{\mu\nu}$ themselves defined in some previous course on relativity theory. Using this representation, we get a corresponding representation

\[
R \colon \mathrm{SO}^+(3,1) \to \mathrm{GL}(\mathbb{R}^4)
\]

via the exponential map by defining

\[
R(\Lambda) = \exp\bigl(\tfrac{1}{2}\, \omega_{\mu\nu}\, \rho(M^{\mu\nu})\bigr).
\]

Then, the map $\exp$ becomes the usual exponential (series) of matrices.
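Both claims of Example 18.6 lend themselves to a quick machine check. The sketch below is our own illustration: it builds the $4 \times 4$ matrices $\rho(M^{\mu\nu})^a{}_b = \eta^{\nu a}\delta^\mu_b - \eta^{\mu a}\delta^\nu_b$, verifies the bracket relation for all index combinations in exact integer arithmetic, and then sums a truncated exponential series for a pure boost along the first spatial axis. The helper names (`rho_M`, `combine`, `expm`) and the truncation at 30 terms are assumptions made for the example.

```python
import math
from itertools import product

# Minkowski metric, signature (-, +, +, +); the inverse metric has the same entries.
ETA = [[-1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]

def rho_M(mu, nu):
    """Matrix with entries rho(M^{mu nu})^a_b = eta^{nu a} delta^mu_b - eta^{mu a} delta^nu_b."""
    return [[ETA[nu][a] * (mu == b) - ETA[mu][a] * (nu == b) for b in range(4)]
            for a in range(4)]

def matmul(x, y):
    return [[sum(x[i][k] * y[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def combine(terms):
    """Linear combination sum_i c_i X_i of 4x4 matrices, given as (c_i, X_i) pairs."""
    return [[sum(c * x[i][j] for c, x in terms) for j in range(4)]
            for i in range(4)]

def commutator(x, y):
    return combine([(1, matmul(x, y)), (-1, matmul(y, x))])

def bracket_rhs(mu, nu, rho, sigma):
    """Right-hand side of the bracket relation quoted in the text."""
    return combine([
        (ETA[nu][sigma], rho_M(mu, rho)),
        (ETA[mu][rho], rho_M(nu, sigma)),
        (-ETA[nu][rho], rho_M(mu, sigma)),
        (-ETA[mu][sigma], rho_M(nu, rho)),
    ])

# Exact check over all 4^4 index combinations.
bracket_relation_holds = all(
    commutator(rho_M(m, n), rho_M(r, s)) == bracket_rhs(m, n, r, s)
    for m, n, r, s in product(range(4), repeat=4)
)

def expm(x, terms=30):
    """Truncated exponential series sum_{k < terms} x^k / k!."""
    identity = [[float(i == j) for j in range(4)] for i in range(4)]
    result, term = identity, identity
    for k in range(1, terms):
        term = [[v / k for v in row] for row in matmul(term, x)]
        result = combine([(1, result), (1, term)])
    return result

# omega_{01} = -omega_{10} = psi, so (1/2) omega_{mu nu} M^{mu nu} = psi * M^{01}.
psi = 0.7
boost = expm(combine([(psi, rho_M(0, 1))]))
```

With these conventions the upper-left $2 \times 2$ block of `boost` is the familiar hyperbolic rotation with entries $\cosh\psi$ and $\sinh\psi$, while the remaining two directions are left untouched.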

Definition. A one-parameter subgroup of a Lie group $G$ is a Lie group homomorphism

\[
\xi \colon \mathbb{R} \to G,
\]

with $\mathbb{R}$ understood as a Lie group under ordinary addition.


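Example 18.7. Let $M$ be a smooth manifold and let $Y \in \Gamma(TM)$ be a complete vector field. The flow of $Y$ is the smooth map

\[
\Theta \colon \mathbb{R} \times M \to M, \qquad (\lambda, p) \mapsto \Theta_\lambda(p) := \gamma_p(\lambda),
\]

where $\gamma_p$ is the maximal integral curve of $Y$ through $p$. For a fixed $p$, we have

\[
\Theta_0 = \mathrm{id}_M, \qquad \Theta_{\lambda_1} \circ \Theta_{\lambda_2} = \Theta_{\lambda_1 + \lambda_2}, \qquad \Theta_{-\lambda} = \Theta_\lambda^{-1}.
\]

For each $\lambda \in \mathbb{R}$, the map $\Theta_\lambda$ is a diffeomorphism $M \to M$. Denoting by $\mathrm{Diff}(M)$ the group (under composition) of the diffeomorphisms $M \to M$, we have that the map

\[
\xi \colon \mathbb{R} \to \mathrm{Diff}(M), \qquad \lambda \mapsto \Theta_\lambda
\]

is a one-parameter subgroup of $\mathrm{Diff}(M)$.

Theorem 18.8. Let $G$ be a Lie group.

i) Let $A \in T_eG$. The map

\[
\xi^A \colon \mathbb{R} \to G, \qquad \lambda \mapsto \xi^A(\lambda) := \exp(\lambda A)
\]

is a one-parameter subgroup.

ii) Every one-parameter subgroup of $G$ has the form $\xi^A$ for some $A \in T_eG$.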


Therefore, the Lie algebra allows us to study all the one-parameter subgroups of the 
Lie group. 
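For a matrix Lie group, where $\exp$ is the matrix exponential series, the first part of Theorem 18.8 can be observed numerically. The following sketch is our own illustration: the anti-symmetric matrix `A` is an arbitrarily chosen element of $\mathfrak{so}(3)$, and the series is truncated at 40 terms, which is more than enough for the small arguments used here.

```python
def identity(n):
    return [[float(i == j) for j in range(n)] for i in range(n)]

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def transpose(a):
    return [list(col) for col in zip(*a)]

def scale(c, a):
    return [[c * v for v in row] for row in a]

def expm(x, terms=40):
    """Truncated exponential series sum_{k < terms} x^k / k!."""
    n = len(x)
    result, term = identity(n), identity(n)
    for k in range(1, terms):
        term = [[v / k for v in row] for row in matmul(term, x)]
        result = [[r + t for r, t in zip(rr, tr)] for rr, tr in zip(result, term)]
    return result

def close(a, b, tol=1e-10):
    return all(abs(x - y) <= tol
               for row_a, row_b in zip(a, b)
               for x, y in zip(row_a, row_b))

# An arbitrary anti-symmetric matrix, i.e. an element of so(3) (illustrative values).
A = [[0.0, -0.3, 0.2],
     [0.3, 0.0, -0.5],
     [-0.2, 0.5, 0.0]]

def xi(lam):
    """The curve lambda -> exp(lambda * A)."""
    return expm(scale(lam, A))

s, t = 0.4, 1.1
is_homomorphism = close(xi(s + t), matmul(xi(s), xi(t)))   # xi(s + t) = xi(s) xi(t)
starts_at_identity = close(xi(0.0), identity(3))           # xi(0) = e
stays_orthogonal = close(matmul(transpose(xi(s)), xi(s)), identity(3))
```

The last check confirms that the curve never leaves the orthogonal group, as it should for a generator taken from $\mathfrak{so}(3)$.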


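Theorem 18.9. Let $G$ and $H$ be Lie groups and let $\phi \colon G \to H$ be a Lie group homomorphism. Then, for all $A \in T_{e_G}G$, we have

\[
\phi(\exp(A)) = \exp\bigl((\phi_*)_{e_G} A\bigr).
\]

Equivalently, the following diagram commutes.

\[
\begin{array}{ccc}
T_{e_G}G & \xrightarrow{\;(\phi_*)_{e_G}\;} & T_{e_H}H \\
\big\downarrow{\scriptstyle\exp} & & \big\downarrow{\scriptstyle\exp} \\
G & \xrightarrow{\;\phi\;} & H
\end{array}
\]

In particular, for $\phi \equiv \mathrm{Ad}_g \colon G \to G$, we have

\[
\mathrm{Ad}_g(\exp(A)) = \exp\bigl((\mathrm{Ad}_{g*})_e A\bigr).
\]

19 Principal fibre bundles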


Very roughly speaking, a principal fibre bundle is a bundle whose typical fibre is a Lie group. 
Principal fibre bundles are so immensely important because they allow us to understand 
any fibre bundle with fibre $F$ on which a Lie group $G$ acts. These are then called associated
fibre bundles, and will be discussed later on. 


19.1 Lie group actions on a manifold 


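Definition. Let $(G, \bullet)$ be a Lie group and let $M$ be a smooth manifold. A smooth map

\[
\triangleright \colon G \times M \to M, \qquad (g, p) \mapsto g \triangleright p
\]

satisfying

i) $\forall\, p \in M : e \triangleright p = p$;

ii) $\forall\, g_1, g_2 \in G : \forall\, p \in M : (g_1 \bullet g_2) \triangleright p = g_1 \triangleright (g_2 \triangleright p)$,

is called a left Lie group action, or left $G$-action, on $M$.

Definition. A manifold equipped with a left $G$-action is called a left $G$-manifold.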


Remark 19.1. Note that in the above definition, the smooth structures on G and M were 
only used in the requirement that > be smooth. By dropping this condition, we obtain the 
usual definition of a group action on a set. Some of the definitions that we will soon give 
for Lie groups and smooth manifolds, such as those of orbits and stabilisers, also have clear 
analogues to the case of bare groups and sets. 


Example 19.2. Let $G$ be a Lie group and let $R \colon G \to \mathrm{GL}(V)$ be a representation of $G$ on a vector space $V$. Define a map

\[
\triangleright \colon G \times V \to V, \qquad (g, v) \mapsto g \triangleright v := R(g)v.
\]

We easily check that $e \triangleright v := R(e)v = \mathrm{id}_V v = v$ and

\[
\begin{aligned}
(g_1 \bullet g_2) \triangleright v &:= R(g_1 \bullet g_2)v \\
&= (R(g_1) \circ R(g_2))v \\
&= R(g_1)(R(g_2)v) \\
&= g_1 \triangleright (g_2 \triangleright v),
\end{aligned}
\]

for any $v \in V$ and any $g_1, g_2 \in G$. Moreover, if we equip $V$ with the usual smooth structure, the map $\triangleright$ is smooth and hence a Lie group action on $V$. It follows that representations of Lie groups are just a special case of left Lie group actions. We can therefore think of left $G$-actions as generalised representations of $G$ on some manifold.

Definition. Similarly, a right $G$-action on $M$ is a smooth map

\[
\triangleleft \colon M \times G \to M, \qquad (p, g) \mapsto p \triangleleft g
\]

satisfying

i) $\forall\, p \in M : p \triangleleft e = p$;

ii) $\forall\, g_1, g_2 \in G : \forall\, p \in M : p \triangleleft (g_1 \bullet g_2) = (p \triangleleft g_1) \triangleleft g_2$.

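Proposition 19.3. Let $\triangleright$ be a left $G$-action on $M$. Then

\[
\triangleleft \colon M \times G \to M, \qquad (p, g) \mapsto p \triangleleft g := g^{-1} \triangleright p
\]

is a right $G$-action on $M$.

Proof. First note that $\triangleleft$ is smooth since it is a composition of $\triangleright$ and the inverse map on $G$, which are both smooth. We have $p \triangleleft e := e^{-1} \triangleright p = e \triangleright p = p$ and

\[
\begin{aligned}
p \triangleleft (g_1 \bullet g_2) &:= (g_1 \bullet g_2)^{-1} \triangleright p \\
&= (g_2^{-1} \bullet g_1^{-1}) \triangleright p \\
&= g_2^{-1} \triangleright (g_1^{-1} \triangleright p) \\
&= g_2^{-1} \triangleright (p \triangleleft g_1) \\
&= (p \triangleleft g_1) \triangleleft g_2,
\end{aligned}
\]

for all $p \in M$ and $g_1, g_2 \in G$, and hence $\triangleleft$ is a right $G$-action.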
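The role of the inverse in Proposition 19.3 is easiest to appreciate in a non-abelian example. The sketch below is our own illustration and uses a finite group in place of a positive-dimensional Lie group: the symmetric group $S_3$ (regarded as a discrete, 0-dimensional Lie group) acting on the three points $\{0, 1, 2\}$. All function names are hypothetical helpers introduced for the example.

```python
from itertools import permutations

# A permutation g of {0, 1, 2} is stored as a tuple with g[i] the image of i.
GROUP = list(permutations(range(3)))
IDENTITY = (0, 1, 2)
POINTS = range(3)

def compose(g1, g2):
    """Group product g1 . g2: apply g2 first, then g1."""
    return tuple(g1[g2[i]] for i in range(3))

def inverse(g):
    inv = [0] * 3
    for i, image in enumerate(g):
        inv[image] = i
    return tuple(inv)

def left(g, p):
    """Left action g |> p on the points."""
    return g[p]

def right(p, g):
    """Right action p <| g := g^{-1} |> p, as in Proposition 19.3."""
    return left(inverse(g), p)

def naive_right(p, g):
    """Candidate p <| g := g |> p without the inverse (fails in general)."""
    return left(g, p)

left_axioms = all(left(IDENTITY, p) == p for p in POINTS) and all(
    left(compose(g1, g2), p) == left(g1, left(g2, p))
    for g1 in GROUP for g2 in GROUP for p in POINTS)

right_axioms = all(right(p, IDENTITY) == p for p in POINTS) and all(
    right(p, compose(g1, g2)) == right(right(p, g1), g2)
    for g1 in GROUP for g2 in GROUP for p in POINTS)

naive_is_right_action = all(
    naive_right(p, compose(g1, g2)) == naive_right(naive_right(p, g1), g2)
    for g1 in GROUP for g2 in GROUP for p in POINTS)
```

Here `left_axioms` and `right_axioms` both hold, whereas `naive_is_right_action` fails: without the inverse, the order of $g_1$ and $g_2$ comes out the wrong way round whenever the two elements do not commute.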


Remark 19.4. Since for each $g \in G$ we also have $g^{-1} \in G$, if we need some action of $G$ on $M$, then a left action is just as good as a right action. Only later, within the context of principal and associated fibre bundles, will we attach separate “meanings” to left and right actions. Some of the next definitions and results will only be given in terms of left actions, but they obviously apply to right actions as well.


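Remark 19.5. Recall that if we have a basis $e_1, \ldots, e_{\dim M}$ of $T_pM$ and $X^1, \ldots, X^{\dim M}$ are the components of some $X \in T_pM$ in this basis, then under a change of basis

\[
\tilde e_a = A^b{}_a\, e_b
\]

we have $X = \tilde X^a \tilde e_a$, where

\[
\tilde X^a = (A^{-1})^a{}_b\, X^b.
\]
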
Once expressed in terms of principal and associated fibre bundles, we will see that the 
“recipe” of labelling the basis by lower indices and the vector components by upper indices, 
as well as their transformation law, can be understood as a right action of GL(dim M,R) 
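on the basis and a left action of the same GL(dim M,R) on the components.

Definition. Let $G$, $H$ be Lie groups, let $\rho \colon G \to H$ be a Lie group homomorphism and let

\[
\triangleright \colon G \times M \to M, \qquad \blacktriangleright \colon H \times N \to N
\]

be left actions of $G$ and $H$ on some smooth manifolds $M$ and $N$, respectively. Then, a smooth map $f \colon M \to N$ is said to be $\rho$-equivariant if the diagram

\[
\begin{array}{ccc}
G \times M & \xrightarrow{\;\rho \times f\;} & H \times N \\
\big\downarrow{\scriptstyle\triangleright} & & \big\downarrow{\scriptstyle\blacktriangleright} \\
M & \xrightarrow{\;f\;} & N
\end{array}
\]

where $(\rho \times f)(g, p) := (\rho(g), f(p)) \in H \times N$, commutes. Equivalently,

\[
\forall\, g \in G : \forall\, p \in M : \quad f(g \triangleright p) = \rho(g) \blacktriangleright f(p).
\]

In other words, if $\rho \colon G \to H$ is a Lie group homomorphism, then the $\rho$-equivariant maps are the “action-preserving” maps between the $G$-manifold $M$ and the $H$-manifold $N$.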


Remark 19.6. Note that by setting $\rho = \mathrm{id}_G$ or $f = \mathrm{id}_M$, the notion of $f$ being $\rho$-equivariant reduces to what we might call a homomorphism of $G$-manifolds in the former case, and a homomorphism of left actions on $M$ in the latter.

Definition. Let $\triangleright \colon G \times M \to M$ be a left $G$-action. For each $p \in M$, we define the orbit of $p$ as the set

\[
G_p := \{q \in M \mid \exists\, g \in G : q = g \triangleright p\}.
\]

Alternatively, the orbit of $p$ is the image of $G$ under the map $(- \triangleright p)$. It consists of all the points in $M$ that can be reached from $p$ by successive applications of the action $\triangleright$.

Example 19.7. Consider the action induced by the representation of $\mathrm{SO}(2,\mathbb{R})$ as rotation matrices in $\mathrm{End}(\mathbb{R}^2)$. The orbit of any $p \in \mathbb{R}^2$ is the circle of radius $|p|$ centred at the origin.

It should be intuitively clear from the definition that the orbits of two points are either disjoint or coincide. In fact, we have the following.


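Proposition 19.8. Let $\triangleright \colon G \times M \to M$ be an action on $M$. Define a relation on $M$

\[
p \sim q \;:\Leftrightarrow\; \exists\, g \in G : q = g \triangleright p.
\]

Then $\sim$ is an equivalence relation on $M$.

Proof. Let $p, q, r \in M$. We have

i) $p \sim p$ since $p = e \triangleright p$;

ii) $p \sim q \Rightarrow q \sim p$ since, if $q = g \triangleright p$, then

\[
p = e \triangleright p = (g^{-1} \bullet g) \triangleright p = g^{-1} \triangleright (g \triangleright p) = g^{-1} \triangleright q;
\]

iii) ($p \sim q$ and $q \sim r$) $\Rightarrow p \sim r$ since, if $q = g_1 \triangleright p$ and $r = g_2 \triangleright q$, then

\[
r = g_2 \triangleright (g_1 \triangleright p) = (g_2 \bullet g_1) \triangleright p.
\]

Therefore, $\sim$ is an equivalence relation on $M$.

The equivalence classes of $\sim$ are, by definition, the orbits.

Definition. Let $\triangleright \colon G \times M \to M$ be an action on $M$. The orbit space of $M$ is

\[
M/G := M/{\sim} = \{G_p \mid p \in M\}.
\]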


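Example 19.9. The orbit space of our previous $\mathrm{SO}(2,\mathbb{R})$-action on $\mathbb{R}^2$ is the partition of $\mathbb{R}^2$ into concentric circles centred at the origin, plus the origin itself.

Definition. Let $\triangleright \colon G \times M \to M$ be a $G$-action on $M$. The stabiliser of $p \in M$ is

\[
S_p := \{g \in G \mid g \triangleright p = p\}.
\]

Note that for each $p \in M$, the stabiliser $S_p$ is a subgroup of $G$.

Example 19.10. In our $\mathrm{SO}(2,\mathbb{R})$ example, we have $S_p = \{\mathrm{id}_{\mathbb{R}^2}\}$ for $p \neq 0$ and $S_0 = \mathrm{SO}(2,\mathbb{R})$.

Definition. A left $G$-action $\triangleright \colon G \times M \to M$ is said to be

i) free if for all $p \in M$, we have $S_p = \{e\}$;

ii) transitive if for all $p, q \in M$, there exists $g \in G$ such that $q = g \triangleright p$.

Example 19.11. The action $\triangleright \colon G \times V \to V$ induced by a representation $R \colon G \to \mathrm{GL}(V)$ is never free (for non-trivial $G$) since we always have $S_0 = G$.

Example 19.12. Consider the action $\triangleright \colon T(n) \times \mathbb{R}^n \to \mathbb{R}^n$ of the $n$-dimensional translation group $T(n)$ on $\mathbb{R}^n$. We have, rather trivially, $T(n)_p = \mathbb{R}^n$ for every $p \in \mathbb{R}^n$. It is also easy to show that this action is free and transitive.

Proposition 19.13. Let $\triangleright \colon G \times M \to M$ be a free action. Then

\[
g_1 \triangleright p = g_2 \triangleright p \;\Leftrightarrow\; g_1 = g_2.
\]

Proof. The ($\Leftarrow$) direction is obvious. Suppose that there exist $p \in M$ and $g_1, g_2 \in G$ such that $g_1 \triangleright p = g_2 \triangleright p$. Then

\[
\begin{aligned}
g_1 \triangleright p = g_2 \triangleright p
&\;\Leftrightarrow\; g_2^{-1} \triangleright (g_1 \triangleright p) = g_2^{-1} \triangleright (g_2 \triangleright p) \\
&\;\Leftrightarrow\; (g_2^{-1} \bullet g_1) \triangleright p = (g_2^{-1} \bullet g_2) \triangleright p \\
&\;\Leftrightarrow\; (g_2^{-1} \bullet g_1) \triangleright p = e \triangleright p \\
&\;\Leftrightarrow\; (g_2^{-1} \bullet g_1) \triangleright p = p.
\end{aligned}
\]

Hence $g_2^{-1} \bullet g_1 \in S_p$, but since $\triangleright$ is free we have $S_p = \{e\}$, and thus $g_1 = g_2$.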


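Proposition 19.14. If $\triangleright \colon G \times M \to M$ is a free action, then

\[
\forall\, p \in M : \quad G_p \cong_{\mathrm{diff}} G.
\]

Example 19.15. Define $\triangleright \colon \mathrm{SO}(2,\mathbb{R}) \times \mathbb{R}^2 \setminus \{0\} \to \mathbb{R}^2 \setminus \{0\}$ to coincide with the action induced by the representation of $\mathrm{SO}(2,\mathbb{R})$ on $\mathbb{R}^2$ for each non-zero point of $\mathbb{R}^2$. Then this action is free, since we have $S_p = \{\mathrm{id}_{\mathbb{R}^2}\}$ for $p \neq 0$, and the previous proposition implies

\[
\forall\, p \in \mathbb{R}^2 \setminus \{0\} : \quad \mathrm{SO}(2,\mathbb{R})_p \cong_{\mathrm{diff}} \mathrm{SO}(2,\mathbb{R}) \cong_{\mathrm{diff}} S^1.
\]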
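Orbits, stabilisers and freeness can be computed by brute force when the group is finite. The following sketch is our own illustration, again with the symmetric group $S_3$ standing in for a Lie group. It contrasts the action of $S_3$ on three points, which is transitive but not free, with the action of $S_3$ on itself by left multiplication, which is free, so that each orbit is in bijection with the group as Proposition 19.14 suggests.

```python
from itertools import permutations

# A permutation g of {0, 1, 2} is stored as a tuple with g[i] the image of i.
GROUP = list(permutations(range(3)))
IDENTITY = (0, 1, 2)

def compose(g1, g2):
    """Group product g1 . g2: apply g2 first, then g1."""
    return tuple(g1[g2[i]] for i in range(3))

def orbit(action, p):
    """All points reachable from p, i.e. the image of G under g -> g |> p."""
    return {action(g, p) for g in GROUP}

def stabiliser(action, p):
    """All group elements that fix p."""
    return [g for g in GROUP if action(g, p) == p]

def act_on_points(g, p):
    """S_3 acting on the points 0, 1, 2."""
    return g[p]

def act_on_group(g, h):
    """S_3 acting on itself by left multiplication."""
    return compose(g, h)

# Action on points: transitive, but every point has a two-element stabiliser.
orbit_of_0 = orbit(act_on_points, 0)
stab_of_0 = stabiliser(act_on_points, 0)

# Action on the group itself: free, and the orbit has as many elements as G.
h = (1, 2, 0)
orbit_of_h = orbit(act_on_group, h)
stab_of_h = stabiliser(act_on_group, h)
```

In both cases the size of the orbit times the size of the stabiliser equals the order of the group, which is the finite counterpart of the statement that a free action has orbits that look like $G$.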


19.2 Principal fibre bundles 


This is a good time to review our earlier section on bundles. We can specialise our definition of bundle to define a smooth bundle, which is just a bundle $(E, \pi, M)$ where $E$ and $M$ are smooth manifolds and the projection $\pi \colon E \to M$ is smooth. Two smooth bundles $(E, \pi, M)$ and $(E', \pi', M')$ are isomorphic if there exist diffeomorphisms $u$, $f$ such that the following diagram commutes.

\[
\begin{array}{ccc}
E & \xrightarrow{\;u\;} & E' \\
\big\downarrow{\scriptstyle\pi} & & \big\downarrow{\scriptstyle\pi'} \\
M & \xrightarrow{\;f\;} & M'
\end{array}
\]

Definition. Let $G$ be a Lie group. A smooth bundle $(E, \pi, M)$ is called a principal $G$-bundle if $E$ is equipped with a free right $G$-action and

\[
\begin{array}{c} E \\ \big\downarrow{\scriptstyle\pi} \\ M \end{array}
\;\cong_{\mathrm{bdl}}\;
\begin{array}{c} E \\ \big\downarrow{\scriptstyle\rho} \\ E/G \end{array}
\]

where $\rho$ is the quotient map, defined by sending each $p \in E$ to its equivalence class (i.e. orbit) in the orbit space $E/G$.

Observe that since the right action of $G$ on $E$ is free, for each $p \in E$ we have

\[
\mathrm{preim}_\rho(G_p) = G_p \cong_{\mathrm{diff}} G.
\]

We said at the beginning that, roughly speaking, a principal bundle is a bundle whose fibre at each point is a Lie group. Note that the formal definition is that a principal $G$-bundle is a bundle which is isomorphic to a bundle whose fibres are the orbits under the right action of $G$, which are themselves isomorphic to $G$ since the action is free.


Remark 19.16. A slight generalisation would be to consider smooth bundles $E \xrightarrow{\pi} M$ where $E$ is equipped with a right $G$-action which is free and transitive on each fibre of $E \xrightarrow{\pi} M$. The isomorphism in our definition enforces the fibre-wise transitivity since $G$ acts transitively on each $G_p$ by the definition of orbit.


Example 19.17. a) Let $M$ be a smooth manifold. Consider the space

\[
L_pM := \{(e_1, \ldots, e_{\dim M}) \mid e_1, \ldots, e_{\dim M} \text{ is a basis of } T_pM\}.
\]

We know from linear algebra that the bases of a vector space are related to each other by invertible linear transformations. Hence, after fixing a reference basis, we have a bijection

\[
L_pM \cong \mathrm{GL}(\dim M, \mathbb{R}).
\]

We define the frame bundle of $M$ as

\[
LM := \bigsqcup_{p \in M} L_pM
\]

with the obvious projection map $\pi \colon LM \to M$ sending each basis $(e_1, \ldots, e_{\dim M})$ to the unique point $p \in M$ such that $(e_1, \ldots, e_{\dim M})$ is a basis of $T_pM$. By proceeding similarly to the case of the tangent bundle, we can equip $LM$ with a smooth structure inherited from that of $M$. We then find

\[
\dim LM = \dim M + \dim L_pM = \dim M + (\dim M)^2.
\]


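b) We would now like to make $LM \xrightarrow{\pi} M$ into a principal $\mathrm{GL}(\dim M, \mathbb{R})$-bundle. We define a right $\mathrm{GL}(\dim M, \mathbb{R})$-action on $LM$ by

\[
(e_1, \ldots, e_{\dim M}) \triangleleft g := (g^a{}_1\, e_a, \ldots, g^a{}_{\dim M}\, e_a),
\]

where $g^a{}_b$ are the components of the endomorphism $g \in \mathrm{GL}(\dim M, \mathbb{R})$ with respect to the standard basis on $\mathbb{R}^{\dim M}$. Note that if $(e_1, \ldots, e_{\dim M}) \in L_pM$, we must also have $(e_1, \ldots, e_{\dim M}) \triangleleft g \in L_pM$. This action is free since

\[
(e_1, \ldots, e_{\dim M}) \triangleleft g = (e_1, \ldots, e_{\dim M})
\;\Leftrightarrow\;
(g^a{}_1\, e_a, \ldots, g^a{}_{\dim M}\, e_a) = (e_1, \ldots, e_{\dim M})
\]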


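and hence, by linear independence, $g^a{}_b = \delta^a_b$, so $g = \mathrm{id}_{\mathbb{R}^{\dim M}}$. Note that since all bases of each $T_pM$ are related by some $g \in \mathrm{GL}(\dim M, \mathbb{R})$, $\triangleleft$ is also fibre-wise transitive.

c) We now have to show that

\[
\begin{array}{c} LM \\ \big\downarrow{\scriptstyle\pi} \\ M \end{array}
\;\cong_{\mathrm{bdl}}\;
\begin{array}{c} LM \\ \big\downarrow{\scriptstyle\rho} \\ LM/\mathrm{GL}(\dim M, \mathbb{R}) \end{array}
\]

i.e. that there exist smooth maps $u$ and $f$ such that the diagram

\[
\begin{array}{ccc}
LM & \overset{u}{\underset{u^{-1}}{\rightleftarrows}} & LM \\
\big\downarrow{\scriptstyle\pi} & & \big\downarrow{\scriptstyle\rho} \\
M & \overset{f}{\underset{f^{-1}}{\rightleftarrows}} & LM/\mathrm{GL}(\dim M, \mathbb{R})
\end{array}
\]

commutes. We can simply choose $u = u^{-1} = \mathrm{id}_{LM}$, while we define $f$ as

\[
f \colon M \to LM/\mathrm{GL}(\dim M, \mathbb{R}), \qquad p \mapsto \mathrm{GL}(\dim M, \mathbb{R})_{(e_1, \ldots, e_{\dim M})},
\]

where $(e_1, \ldots, e_{\dim M})$ is some basis of $T_pM$, i.e. $(e_1, \ldots, e_{\dim M}) \in \mathrm{preim}_\pi(\{p\})$. Note that $f$ is well-defined since every basis of $T_pM$ gives rise to the same orbit in the orbit space $LM/\mathrm{GL}(\dim M, \mathbb{R})$. Moreover, it is injective since

\[
f(p) = f(p') \;\Leftrightarrow\; \mathrm{GL}(\dim M, \mathbb{R})_{(e_1, \ldots, e_{\dim M})} = \mathrm{GL}(\dim M, \mathbb{R})_{(e'_1, \ldots, e'_{\dim M})},
\]

which is true only if $(e_1, \ldots, e_{\dim M})$ and $(e'_1, \ldots, e'_{\dim M})$ are bases of the same tangent space, so $p = p'$. It is clearly surjective since every orbit in $LM/\mathrm{GL}(\dim M, \mathbb{R})$ is the orbit of some basis of some tangent space $T_pM$ at some point $p \in M$. The inverse map is given explicitly by

\[
f^{-1} \colon LM/\mathrm{GL}(\dim M, \mathbb{R}) \to M, \qquad \mathrm{GL}(\dim M, \mathbb{R})_{(e_1, \ldots, e_{\dim M})} \mapsto \pi((e_1, \ldots, e_{\dim M})).
\]

Finally, we have

\[
(\rho \circ \mathrm{id}_{LM})(e_1, \ldots, e_{\dim M}) = \mathrm{GL}(\dim M, \mathbb{R})_{(e_1, \ldots, e_{\dim M})} = (f \circ \pi)(e_1, \ldots, e_{\dim M})
\]

and thus $LM \xrightarrow{\pi} M$ is a principal $\mathrm{GL}(\dim M, \mathbb{R})$-bundle, called the frame bundle of $M$.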
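The index placement in the right action on frames is easy to get wrong, so a concrete computation may help. In the sketch below, which is our own illustration, a frame at a single point of a 2-dimensional manifold is stored as a list of basis vectors written in some fixed chart, and `g[a][b]` stores the component $g^a{}_b$. The particular frame and matrices are arbitrary choices.

```python
def act_right(frame, g):
    """(e_1, ..., e_d) <| g := (g^a_1 e_a, ..., g^a_d e_a), with g[a][b] = g^a_b."""
    d = len(frame)
    n = len(frame[0])
    return [[sum(g[a][b] * frame[a][i] for a in range(d)) for i in range(n)]
            for b in range(d)]

def matmul(g1, g2):
    """Matrix product (g1 g2)^a_c = g1^a_b g2^b_c."""
    d = len(g1)
    return [[sum(g1[a][b] * g2[b][c] for b in range(d)) for c in range(d)]
            for a in range(d)]

# A frame (e_1, e_2) and two invertible matrices that do not commute.
frame = [[1, 0], [1, 1]]
g1 = [[1, 2], [0, 1]]
g2 = [[0, 1], [1, 1]]
identity = [[1, 0], [0, 1]]

acted_twice = act_right(act_right(frame, g1), g2)      # (e <| g1) <| g2
acted_once = act_right(frame, matmul(g1, g2))          # e <| (g1 . g2)
wrong_order = act_right(frame, matmul(g2, g1))         # e <| (g2 . g1)

is_right_action = acted_twice == acted_once and act_right(frame, identity) == frame
```

Acting first with `g1` and then with `g2` agrees with acting once with the product `g1 g2`, and not with `g2 g1`, which is precisely what distinguishes a right action from a left one.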


Remark 19.18. A note to the careful reader. As we have just done in the previous example, 
in the following we will sometimes simply assume that certain maps are smooth, instead of 


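rigorously proving it.

19.3 Principal bundle morphisms

Recall that a bundle morphism (also called simply a bundle map) between two bundles $(E, \pi, M)$ and $(E', \pi', M')$ is a pair of maps $(u, f)$ such that the diagram

\[
\begin{array}{ccc}
E & \xrightarrow{\;u\;} & E' \\
\big\downarrow{\scriptstyle\pi} & & \big\downarrow{\scriptstyle\pi'} \\
M & \xrightarrow{\;f\;} & M'
\end{array}
\]

commutes, that is, $f \circ \pi = \pi' \circ u$.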


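Definition. Let $(P, \pi, M)$ and $(Q, \pi', N)$ be principal $G$-bundles. A principal bundle morphism from $(P, \pi, M)$ to $(Q, \pi', N)$ is a pair of smooth maps $(u, f)$ such that the diagram

\[
\begin{array}{ccc}
P & \xrightarrow{\;u\;} & Q \\
\big\downarrow{\scriptstyle\triangleleft G} & & \big\downarrow{\scriptstyle\triangleleft G} \\
P & \xrightarrow{\;u\;} & Q \\
\big\downarrow{\scriptstyle\pi} & & \big\downarrow{\scriptstyle\pi'} \\
M & \xrightarrow{\;f\;} & N
\end{array}
\]

commutes, that is, for all $p \in P$ and $g \in G$, we have

\[
\begin{aligned}
(f \circ \pi)(p) &= (\pi' \circ u)(p), \\
u(p \triangleleft g) &= u(p) \triangleleft g.
\end{aligned}
\]

Note that $P \xrightarrow{\triangleleft G} P$ is a shorthand for the inclusion of $P$ into the product $P \times G$ followed by the right action $\triangleleft$, i.e.

\[
P \xrightarrow{\;\triangleleft G\;} P \quad = \quad P \hookrightarrow P \times G \xrightarrow{\;\triangleleft\;} P,
\]

and similarly for $Q \xrightarrow{\triangleleft G} Q$.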


Definition. A principal bundle morphism between two principal G-bundles is an isomor- 
phism (or diffeomorphism) of principal bundles if it is also a bundle isomorphism. 


Remark 19.19. Note that the passage from principal bundle morphism to principal bundle 
isomorphism does not require any extra condition involving the Lie group G. We will soon 
see that this is because the two bundles are both principal G-bundles. We can further 


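generalise the notion of principal bundle morphism as follows.

Definition. Let $(P, \pi, M)$ be a principal $G$-bundle, let $(Q, \pi', N)$ be a principal $H$-bundle, and let $\rho \colon G \to H$ be a Lie group homomorphism. A principal bundle morphism from $(P, \pi, M)$ to $(Q, \pi', N)$ is a pair of smooth maps $(u, f)$ such that the diagram

\[
\begin{array}{ccc}
P \times G & \xrightarrow{\;u \times \rho\;} & Q \times H \\
\big\downarrow{\scriptstyle\triangleleft} & & \big\downarrow{\scriptstyle\blacktriangleleft} \\
P & \xrightarrow{\;u\;} & Q \\
\big\downarrow{\scriptstyle\pi} & & \big\downarrow{\scriptstyle\pi'} \\
M & \xrightarrow{\;f\;} & N
\end{array}
\]

commutes, that is, $f \circ \pi = \pi' \circ u$ and $u$ is a $\rho$-equivariant map

\[
\forall\, p \in P : \forall\, g \in G : \quad u(p \triangleleft g) = u(p) \blacktriangleleft \rho(g).
\]

Definition. A principal bundle morphism between a principal $G$-bundle and a principal $H$-bundle is an isomorphism (or diffeomorphism) of principal bundles if it is also a bundle isomorphism and $\rho$ is a Lie group isomorphism.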


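Lemma 19.20. Let $(P, \pi, M)$ and $(Q, \pi', M)$ be principal $G$-bundles over the same base manifold $M$. Then, any $u \colon P \to Q$ such that $(u, \mathrm{id}_M)$ is a principal bundle morphism is necessarily a diffeomorphism.

\[
\begin{array}{ccc}
P & \xrightarrow{\;u\;} & Q \\
\big\downarrow{\scriptstyle\triangleleft G} & & \big\downarrow{\scriptstyle\triangleleft G} \\
P & \xrightarrow{\;u\;} & Q \\
 & {\scriptstyle\pi}\searrow \quad \swarrow{\scriptstyle\pi'} & \\
 & M &
\end{array}
\]

Proof. We already know that $u$ is smooth since $(u, \mathrm{id}_M)$ is assumed to be a principal bundle morphism. It remains to check that $u$ is bijective and its inverse is also smooth.

i) Let $p_1, p_2 \in P$ be such that $u(p_1) = u(p_2)$. Then

\[
\pi(p_1) = \pi'(u(p_1)) = \pi'(u(p_2)) = \pi(p_2),
\]

that is, $p_1$ and $p_2$ belong to the same fibre. As the action of $G$ on $P$ is fibre-wise transitive, there is a unique $g \in G$ such that $p_1 = p_2 \triangleleft g$. Then

\[
u(p_1) = u(p_2 \triangleleft g) = u(p_2) \triangleleft g = u(p_1) \triangleleft g,
\]

so $g \in S_{u(p_1)}$, but since $\triangleleft$ is free, we have $g = e$ and thus

\[
p_1 = p_2 \triangleleft e = p_2.
\]

Therefore $u$ is injective.

ii) Let $q \in Q$. Choose some $p \in \mathrm{preim}_\pi(\pi'(q))$. Then, we have

\[
\pi'(u(p)) = \pi(p) = \pi'(q)
\]

so that $u(p)$ and $q$ belong to the same fibre. Hence, there is a unique $g \in G$ such that $q = u(p) \triangleleft g$. We thus have

\[
q = u(p) \triangleleft g = u(p \triangleleft g)
\]

and since $p \triangleleft g \in P$, the map $u$ is surjective.

Hence, $u$ is bijective. We omit the verification that $u^{-1}$ is smooth, so that $u$ is a diffeomorphism.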


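Definition. A principal $G$-bundle $(P, \pi, M)$ is called trivial if it is isomorphic as a principal $G$-bundle to the principal $G$-bundle $(M \times G, \pi_1, M)$, where $\pi_1$ is the projection onto the first component and the action is defined as

\[
\triangleleft \colon (M \times G) \times G \to M \times G, \qquad ((p, g), g') \mapsto (p, g) \triangleleft g' := (p, g \bullet g').
\]

By the previous lemma, a principal $G$-bundle $(P, \pi, M)$ is trivial if there exists a smooth map $u \colon P \to M \times G$ such that the following diagram commutes.

\[
\begin{array}{ccc}
P & \xrightarrow{\;u\;} & M \times G \\
\big\downarrow{\scriptstyle\triangleleft G} & & \big\downarrow{\scriptstyle\triangleleft G} \\
P & \xrightarrow{\;u\;} & M \times G \\
 & {\scriptstyle\pi}\searrow \quad \swarrow{\scriptstyle\pi_1} & \\
 & M &
\end{array}
\]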


The following result provides a necessary and sufficient criterion for when a principal 
bundle is trivial. Note that while we have used the lower case letter p almost exclusively to 
denote points of the base manifold M, in the next proof we will use it to denote points of 
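the total space $P$ instead.

Theorem 19.21. A principal $G$-bundle $(P, \pi, M)$ is trivial if, and only if, there exists a smooth section $\sigma \in \Gamma(P)$, that is, a smooth $\sigma \colon M \to P$ such that $\pi \circ \sigma = \mathrm{id}_M$.

Proof. ($\Rightarrow$) Suppose that $(P, \pi, M)$ is trivial. Then there exists a diffeomorphism $u \colon P \to M \times G$ which makes the following diagram commute.

\[
\begin{array}{ccc}
P & \xrightarrow{\;u\;} & M \times G \\
\big\downarrow{\scriptstyle\triangleleft G} & & \big\downarrow{\scriptstyle\triangleleft G} \\
P & \xrightarrow{\;u\;} & M \times G \\
 & {\scriptstyle\pi}\searrow \quad \swarrow{\scriptstyle\pi_1} & \\
 & M &
\end{array}
\]

We can define

\[
\sigma \colon M \to P, \qquad m \mapsto u^{-1}(m, e),
\]

where $e$ is the identity of $G$. Then $\sigma$ is smooth since it is the composition of $u^{-1}$ with the map $m \mapsto (m, e)$, which are both smooth. We also have

\[
(\pi \circ \sigma)(m) = \pi(u^{-1}(m, e)) = \pi_1(m, e) = m,
\]

for all $m \in M$, hence $\pi \circ \sigma = \mathrm{id}_M$ and thus $\sigma \in \Gamma(P)$.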


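($\Leftarrow$) Suppose that there exists a smooth section $\sigma \colon M \to P$. Let $p \in P$ and consider the point $\sigma(\pi(p)) \in P$. We have

\[
\pi(\sigma(\pi(p))) = \mathrm{id}_M(\pi(p)) = \pi(p),
\]

hence $\sigma(\pi(p))$ and $p$ belong to the same fibre, and thus there exists a unique group element in $G$ which links the two points via $\triangleleft$. Since this element depends on both $\sigma$ and $p$, let us denote it by $\chi_\sigma(p)$. Then, $\chi_\sigma$ defines a function

\[
\chi_\sigma \colon P \to G, \qquad p \mapsto \chi_\sigma(p)
\]

and we can write

\[
\forall\, p \in P : \quad p = \sigma(\pi(p)) \triangleleft \chi_\sigma(p).
\]

In particular, for any other $g \in G$ we have $p \triangleleft g \in P$ and thus

\[
p \triangleleft g = \sigma(\pi(p \triangleleft g)) \triangleleft \chi_\sigma(p \triangleleft g) = \sigma(\pi(p)) \triangleleft \chi_\sigma(p \triangleleft g),
\]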


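where the second equality follows from the fact that the fibres of $P$ are precisely the orbits under the action of $G$.

On the other hand, we can act on the right with an arbitrary $g \in G$ directly to obtain

\[
p \triangleleft g = (\sigma(\pi(p)) \triangleleft \chi_\sigma(p)) \triangleleft g = \sigma(\pi(p)) \triangleleft (\chi_\sigma(p) \bullet g).
\]

Combining the last two equations yields

\[
\sigma(\pi(p)) \triangleleft \chi_\sigma(p \triangleleft g) = \sigma(\pi(p)) \triangleleft (\chi_\sigma(p) \bullet g)
\]

and hence, since the action is free,

\[
\chi_\sigma(p \triangleleft g) = \chi_\sigma(p) \bullet g.
\]

We can now define the map

\[
u_\sigma \colon P \to M \times G, \qquad p \mapsto (\pi(p), \chi_\sigma(p)).
\]

By our previous lemma, it suffices to show that $u_\sigma$ is a principal bundle map.

\[
\begin{array}{ccc}
P & \xrightarrow{\;u_\sigma\;} & M \times G \\
\big\downarrow{\scriptstyle\triangleleft G} & & \big\downarrow{\scriptstyle\triangleleft G} \\
P & \xrightarrow{\;u_\sigma\;} & M \times G \\
 & {\scriptstyle\pi}\searrow \quad \swarrow{\scriptstyle\pi_1} & \\
 & M &
\end{array}
\]

By definition, we have

\[
(\pi_1 \circ u_\sigma)(p) = \pi_1(\pi(p), \chi_\sigma(p)) = \pi(p)
\]

for all $p \in P$, so the lower triangle commutes. Moreover, we have

\[
u_\sigma(p \triangleleft g) = (\pi(p \triangleleft g), \chi_\sigma(p \triangleleft g)) = (\pi(p), \chi_\sigma(p) \bullet g) = (\pi(p), \chi_\sigma(p)) \triangleleft g = u_\sigma(p) \triangleleft g
\]

for all $p \in P$ and $g \in G$, so the upper square also commutes and hence $(P, \pi, M)$ is a trivial bundle.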
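The construction of $u_\sigma$ from a section can be traced through in a toy model. The sketch below is our own illustration and replaces the smooth setting by a finite one: the base is a three-point set, the structure group is $\mathbb{Z}_5$ under addition, and the section is specified by an arbitrarily chosen table `SHIFT`. Smoothness plays no role here; only the algebra of the proof is being checked.

```python
N = 5                      # structure group G = Z_5 under addition mod 5 (toy choice)
BASE = [0, 1, 2]           # a three-point "base manifold" M
TOTAL = [(m, h) for m in BASE for h in range(N)]   # total space P

def proj(p):
    """Projection pi: P -> M."""
    return p[0]

def act(p, g):
    """Right action p <| g, moving p along its fibre."""
    m, h = p
    return (m, (h + g) % N)

# An arbitrary section sigma: M -> P, given by one fibre element per base point.
SHIFT = {0: 3, 1: 0, 2: 4}

def section(m):
    return (m, SHIFT[m])

def chi(p):
    """The unique g in G with p = section(proj(p)) <| g."""
    base_point = section(proj(p))
    return next(g for g in range(N) if act(base_point, g) == p)

def u_sigma(p):
    """The trivialising map p -> (pi(p), chi_sigma(p)) in M x G."""
    return (proj(p), chi(p))

section_property = all(proj(section(m)) == m for m in BASE)
# On M x G the action is (m, g) <| g' = (m, g + g'), which coincides with `act`.
equivariant = all(u_sigma(act(p, g)) == act(u_sigma(p), g)
                  for p in TOTAL for g in range(N))
bijective = sorted(u_sigma(p) for p in TOTAL) == sorted(TOTAL)
```

The check `equivariant` is the finite analogue of $\chi_\sigma(p \triangleleft g) = \chi_\sigma(p) \bullet g$, and `bijective` reflects the conclusion of the lemma that such a $u_\sigma$ is automatically invertible.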


Example 19.22. The existence of a section on the frame bundle $LM$ can be reduced to the existence of $\dim M$ vector fields on $M$ that are linearly independent at every point (and hence nowhere vanishing). Since no nowhere-vanishing vector field exists on even-dimensional spheres, $LS^{2n}$ is always non-trivial.

20 Associated fibre bundles


An associated fibre bundle is a fibre bundle which is associated (in a precise sense) to a 
principal G-bundle. Associated bundles are related to their underlying principal bundles in 
a way that models the transformation law for components under a change of basis. 

20.1 Associated fibre bundles 


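Definition. Let $(P, \pi, M)$ be a principal $G$-bundle and let $F$ be a smooth manifold equipped with a left $G$-action $\triangleright$. We define

i) $P_F := (P \times F)/{\sim_G}$, where $\sim_G$ is the equivalence relation

\[
(p, f) \sim_G (p', f') \;:\Leftrightarrow\; \exists\, g \in G : \begin{cases} p' = p \triangleleft g \\ f' = g^{-1} \triangleright f \end{cases}
\]

We denote the points of $P_F$ as $[p, f]$.

ii) The map

\[
\pi_F \colon P_F \to M, \qquad [p, f] \mapsto \pi(p),
\]

which is well-defined since, if $[p', f'] = [p, f]$, then for some $g \in G$

\[
\pi_F([p', f']) = \pi_F([p \triangleleft g, g^{-1} \triangleright f]) := \pi(p \triangleleft g) = \pi(p) =: \pi_F([p, f]).
\]

The associated bundle (to $(P, \pi, M)$, $F$ and $\triangleright$) is the bundle $(P_F, \pi_F, M)$.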


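Example 20.1. Recall that the frame bundle $(LM, \pi, M)$ is a principal $\mathrm{GL}(d, \mathbb{R})$-bundle, where $d = \dim M$, with right $G$-action $\triangleleft \colon LM \times G \to LM$ given by

\[
(e_1, \ldots, e_d) \triangleleft g := (g^a{}_1\, e_a, \ldots, g^a{}_d\, e_a).
\]

Let $F := \mathbb{R}^d$ (as a smooth manifold) and define a left action

\[
\triangleright \colon \mathrm{GL}(d, \mathbb{R}) \times \mathbb{R}^d \to \mathbb{R}^d, \qquad (g, x) \mapsto g \triangleright x,
\]

where

\[
(g \triangleright x)^a := g^a{}_b\, x^b.
\]

Then $(LM_{\mathbb{R}^d}, \pi_{\mathbb{R}^d}, M)$ is the associated bundle. In fact, we have a bundle isomorphism

\[
\begin{array}{ccc}
LM_{\mathbb{R}^d} & \xrightarrow{\;u\;} & TM \\
\big\downarrow{\scriptstyle\pi_{\mathbb{R}^d}} & & \big\downarrow{\scriptstyle\pi} \\
M & \xrightarrow{\;\mathrm{id}_M\;} & M
\end{array}
\]

where $(TM, \pi, M)$ is the tangent bundle of $M$, and $u$ is defined as

\[
u \colon LM_{\mathbb{R}^d} \to TM, \qquad [(e_1, \ldots, e_d), x] \mapsto x^a e_a.
\]

The inverse map $u^{-1} \colon TM \to LM_{\mathbb{R}^d}$ works as follows. Given any $X \in TM$, pick any basis $(e_1, \ldots, e_d)$ of the tangent space at the point $\pi(X) \in M$, i.e. any element of $L_{\pi(X)}M$. Decompose $X$ as $x^a e_a$, with each $x^a \in \mathbb{R}$, and define

\[
u^{-1}(X) := [(e_1, \ldots, e_d), x].
\]

The map $u^{-1}$ is well-defined since, while the pair $((e_1, \ldots, e_d), x) \in LM \times \mathbb{R}^d$ clearly depends on the choice of basis, the equivalence class

\[
[(e_1, \ldots, e_d), x] \in LM_{\mathbb{R}^d} := (LM \times \mathbb{R}^d)/{\sim_G}
\]

does not. It includes all pairs $((e_1, \ldots, e_d) \triangleleft g, g^{-1} \triangleright x)$ for every $g \in \mathrm{GL}(d, \mathbb{R})$, i.e. every choice of basis together with the “right” components $x \in \mathbb{R}^d$.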
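That the vector $x^a e_a$ depends only on the equivalence class $[(e_1, \ldots, e_d), x]$ can be confirmed with a small exact computation. The sketch below is our own illustration for $d = 2$: it changes the frame by $\triangleleft g$, changes the components by $g^{-1} \triangleright$, and compares the resulting vectors. The numerical values are arbitrary, and `fractions.Fraction` is used so that the inverse matrix is exact.

```python
from fractions import Fraction

def act_right(frame, g):
    """(e_1, ..., e_d) <| g := (g^a_1 e_a, ..., g^a_d e_a), with g[a][b] = g^a_b."""
    d = len(frame)
    return [[sum(g[a][b] * frame[a][i] for a in range(d)) for i in range(len(frame[0]))]
            for b in range(d)]

def act_left(g, x):
    """(g |> x)^a := g^a_b x^b."""
    return [sum(g[a][b] * x[b] for b in range(len(x))) for a in range(len(g))]

def inverse_2x2(g):
    """Exact inverse of an invertible 2x2 integer matrix."""
    det = g[0][0] * g[1][1] - g[0][1] * g[1][0]
    return [[Fraction(g[1][1], det), Fraction(-g[0][1], det)],
            [Fraction(-g[1][0], det), Fraction(g[0][0], det)]]

def u(frame, x):
    """The map [(e_1, ..., e_d), x] -> x^a e_a, evaluated on a representative."""
    return [sum(x[a] * frame[a][i] for a in range(len(frame)))
            for i in range(len(frame[0]))]

frame = [[1, 0], [1, 1]]     # a basis (e_1, e_2) written in some fixed chart
x = [2, -3]                  # components with respect to that basis
g = [[2, 1], [1, 3]]         # an invertible matrix (determinant 5)

vector = u(frame, x)
same_vector = u(act_right(frame, g), act_left(inverse_2x2(g), x))
```

Both representatives yield the same tangent vector, which is the computational content of the statement that $u$ is well-defined on equivalence classes.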


Remark 20.2. Even though the associated bundle $(LM_{\mathbb{R}^d}, \pi_{\mathbb{R}^d}, M)$ is isomorphic to the tangent bundle $(TM, \pi, M)$, note a subtle difference between the two. On the tangent bundle, the transformation law for a change of basis and the related transformation law for components are deduced from the definitions by undergraduate linear algebra.

On the other hand, the transformation laws on $LM_{\mathbb{R}^d}$ were chosen by us in its definition. We chose the Lie group $\mathrm{GL}(d, \mathbb{R})$, the specific right action $\triangleleft$ on $LM$, the space $\mathbb{R}^d$, and the specific left action on $\mathbb{R}^d$. It just happens that, with these choices, the resulting associated bundle is isomorphic to the tangent bundle. Of course, we have the freedom to make different choices and construct bundles which behave very differently from $TM$.


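Example 20.3. Consider the principal $\mathrm{GL}(d, \mathbb{R})$-bundle $(LM, \pi, M)$ again, with the same right action as before. This time we define

\[
F := (\mathbb{R}^d)^{\times p} \times (\mathbb{R}^{d*})^{\times q} = \underbrace{\mathbb{R}^d \times \cdots \times \mathbb{R}^d}_{p \text{ times}} \times \underbrace{\mathbb{R}^{d*} \times \cdots \times \mathbb{R}^{d*}}_{q \text{ times}}
\]

with left $\mathrm{GL}(d, \mathbb{R})$-action $\triangleright \colon \mathrm{GL}(d, \mathbb{R}) \times F \to F$ given by

\[
(g \triangleright f)^{a_1 \cdots a_p}{}_{b_1 \cdots b_q} := f^{\tilde a_1 \cdots \tilde a_p}{}_{\tilde b_1 \cdots \tilde b_q}\; g^{a_1}{}_{\tilde a_1} \cdots g^{a_p}{}_{\tilde a_p}\; (g^{-1})^{\tilde b_1}{}_{b_1} \cdots (g^{-1})^{\tilde b_q}{}_{b_q}.
\]

Then, the associated bundle $(LM_F, \pi_F, M)$ thus constructed is isomorphic to $(T^p_q M, \pi, M)$, the $(p, q)$-tensor bundle on $M$.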


Now for something new, consider the following. 


Definition. Let M be a smooth manifold and let (ZM,7,M) be its frame bundle, with 
right GL(d, R)-action as above. Let F := (R“)*? x (R@")*? and define a left GL(d, R)-action 
on F' by 


as _ —1\b _1\b, G1: 
(g> OM ee := (det g Ne Is, se 9°" 5, (9 ‘) i mg ‘) ae ys eae 
where w € Z. Then the associated bundle (L Mp, 7p, M) is called the (p, q)-tensor w-density 
bundle on M, and its sections are called (p,q)-tensor densities of weight w. 

Remark 20.4. Some special cases include the following.

i) If $w = 0$, we recover the (p, q)-tensor bundle on M.

ii) If $F = \mathbb{R}$ (i.e. $p = q = 0$), the left action reduces to

$$(g \triangleright f) = (\det g^{-1})^w f,$$

which is the transformation law for a scalar density of weight w.

iii) If GL(d,R) is restricted in such a way that we always have $\det g^{-1} = 1$, then tensor
densities are indistinguishable from ordinary tensor fields. This is why you probably
haven’t met tensor densities in your special relativity course.
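
The scalar-density case ii) lends itself to a quick numerical sanity check. The following Python sketch (the function names and the sample matrices are illustrative assumptions, not part of the notes) checks for 2×2 matrices that $f \mapsto (\det g^{-1})^w f$ is compatible with the group multiplication, i.e. $(g_1 g_2) \triangleright f = g_1 \triangleright (g_2 \triangleright f)$, and that it acts trivially when $\det g = 1$, as in case iii).

```python
# Minimal sketch: scalar densities of weight w under GL(2, R).
# All names below are hypothetical helpers chosen for this illustration.

def det2(g):
    """Determinant of a 2x2 matrix given as nested lists."""
    return g[0][0] * g[1][1] - g[0][1] * g[1][0]

def mat_mul(a, b):
    """Product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def act_scalar_density(g, f, w):
    """Left action (g |> f) = (det g^{-1})^w * f on the fibre F = R."""
    return (1.0 / det2(g)) ** w * f

g1 = [[2.0, 1.0], [0.0, 3.0]]      # det = 6
g2 = [[1.0, -1.0], [2.0, 1.0]]     # det = 3
shear = [[1.0, 1.0], [0.0, 1.0]]   # det = 1
f, w = 5.0, 2

composed = act_scalar_density(mat_mul(g1, g2), f, w)
stepwise = act_scalar_density(g1, act_scalar_density(g2, f, w), w)
unimodular = act_scalar_density(shear, f, w)
print(composed, stepwise, unimodular)
```

The first two printed numbers should agree up to rounding, which is just multiplicativity of the determinant; the third should equal f itself.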


Example 20.5. Recall that if B is a bilinear form on a K-vector space V, the determinant
of B is not independent of the choice of basis. Indeed, if $\{e_a\}$ and $\{e'_a := g^b{}_a e_b\}$ are both
bases of V, where $g \in \mathrm{GL}(\dim V, K)$, then

$$(\det B)' = (\det g^{-1})^2 \det B.$$


Once recast in the principal and associated bundle formalism, we find that the determinant 
of a bilinear form is a scalar density of weight 2. 


20.2 Associated bundle morphisms


Definition. Let $(P_F,\pi_F,M)$ and $(Q_F,\pi'_F,N)$ be the associated bundles (with the same fibre
F) of two principal G-bundles $(P,\pi,M)$ and $(Q,\pi',N)$. An associated bundle map between
the associated bundles is a bundle map $(\tilde u, v)$ between them such that for some $u$, the pair
$(u,v)$ is a principal bundle map between the underlying principal G-bundles and

$$\tilde u([p,f]) = [u(p), f].$$

Equivalently, the following two diagrams both commute: the square $\pi'_F \circ \tilde u = v \circ \pi_F$ for the
associated bundles and, for the principal bundles, the square $\pi' \circ u = v \circ \pi$ together with
$u(p \triangleleft g) = u(p) \triangleleft g$ for all $g \in G$.


Definition. An associated bundle map $(\tilde u, v)$ is an associated bundle isomorphism if $\tilde u$ and
$v$ are invertible and $(\tilde u^{-1}, v^{-1})$ is also an associated bundle map.

Remark 20.6. Note that two associated F-fibre bundles may be isomorphic as bundles but
not as associated bundles. In other words, there may exist a bundle isomorphism between
them, but there may not exist any bundle isomorphism between them which can be written
as in the definition for some principal bundle isomorphism between the underlying principal
bundles.

Recall that an F-fibre bundle $(E,\pi,M)$ is called trivial if there exists a bundle isomorphism

$$u\colon E \to M \times F, \qquad \pi_1 \circ u = \pi,$$

where $\pi_1$ denotes the projection onto the first factor, while a principal G-bundle is called
trivial if there exists a principal bundle isomorphism

$$u\colon P \to M \times G, \qquad \pi_1 \circ u = \pi, \qquad u(p \triangleleft g) = u(p) \triangleleft g.$$


Definition. An associated bundle $(P_F,\pi_F,M)$ is called trivial if the underlying principal
G-bundle $(P,\pi,M)$ is trivial.


Proposition 20.7. A trivial associated bundle is a trivial fibre bundle. 


Note that the converse does not hold. An associated bundle can be trivial as a fibre 
bundle but not as an associated bundle, i.e. the underlying principal fibre bundle need not 
be trivial simply because the associated bundle is trivial as a fibre bundle. 


Definition. Let H be a closed Lie subgroup of G. Let $(P,\pi,M)$ be a principal H-bundle
and $(Q,\pi',M)$ a principal G-bundle. If there exists a principal bundle map from $(P,\pi,M)$
to $(Q,\pi',M)$, i.e. a smooth bundle map which is equivariant with respect to the inclusion of
H into G, then $(P,\pi,M)$ is called an H-restriction of $(Q,\pi',M)$, while $(Q,\pi',M)$ is called
a G-extension of $(P,\pi,M)$.


Theorem 20.8. Let H be a closed Lie subgroup of G.

i) Any principal H-bundle can be extended to a principal G-bundle.

ii) A principal G-bundle $(P,\pi,M)$ can be restricted to a principal H-bundle if, and only
if, the bundle $(P/H,\pi',M)$ has a section.

Example 20.9. i) The bundle $(LM/\mathrm{SO}(d),\pi,M)$ always has a section, and since SO(d)
is a closed Lie subgroup of GL(d,R), the frame bundle can be restricted to a principal
SO(d)-bundle. This is related to the fact that any manifold can be equipped with a
Riemannian metric.

ii) The bundle $(LM/\mathrm{SO}(1,d-1),\pi,M)$ may or may not have a section. For example,
the bundle $(LS^2/\mathrm{SO}(1,1),\pi,S^2)$ does not admit any section, and hence we cannot
restrict $(LS^2,\pi,S^2)$ to a principal SO(1,1)-bundle, even though SO(1,1) is
a closed Lie subgroup of GL(2,R). This is related to the fact that the 2-sphere cannot
be equipped with a Lorentzian metric.


21 Connections and connection 1-forms


In elementary courses on differential geometry or general relativity, the notions of con- 
nection, parallel transport and covariant derivative are often confused with one another. 
Sometimes, the terms are even used as synonyms. If you have seen any of that before, it is 
probably best to forget about it for the time being. 

What a connection really is, is just additional structure on a principal bundle, consisting
of a “smooth” assignment of a particular vector subspace of the tangent space at each point
of the principal bundle, compatible with the right action of the Lie group on the principal
bundle. Such an assignment is, in fact, equivalent to a certain Lie-algebra-valued one-form
on the principal bundle, as we will discuss below. Later, we will see that a connection on a
principal bundle induces a parallel transport map on the principal bundle, which in turn
induces a parallel transport map on any of its associated bundles. If the fibres of the
associated bundle carry a vector space structure, then the parallel transport can be used to
define a covariant derivative on the associated bundle.

Hence the conceptual sequence “connection, parallel transport, covariant derivative” is in
decreasing order of generality, and it should be clear that treating these terms as synonyms
will inevitably lead to confusion. We will now discuss the first of these in some detail.


21.1 Connections on a principal bundle 


Let $(P,\pi,M)$ be a principal G-bundle. Recall that every element $A \in T_eG$ gives rise to
a left-invariant vector field on G which we denoted by $X^A$. However, we will now reserve
this notation for a vector field on P instead. Given $A \in T_eG$, we define $X^A \in \Gamma(TP)$ by

$$X^A_p\colon C^\infty(P) \to \mathbb{R}, \qquad f \mapsto [f(p \triangleleft \exp(tA))]'(0),$$

where the derivative is to be taken with respect to t. We also define the maps

$$i_p\colon T_eG \to T_pP, \qquad A \mapsto X^A_p.$$

The induced map $i\colon T_eG \to \Gamma(TP)$, $A \mapsto X^A$, can be shown to be a Lie algebra homomorphism.


Definition. Let $(P,\pi,M)$ be a principal bundle and let $p \in P$. The vertical subspace at p
is the vector subspace of $T_pP$ given by

$$V_pP := \ker((\pi_*)_p) = \{X_p \in T_pP \mid (\pi_*)_p(X_p) = 0\}.$$


Lemma 21.1. For all $A \in T_eG$ and $p \in P$, we have $X^A_p \in V_pP$.


Proof. Since the action of G simply permutes the elements within each fibre, we have

$$\pi(p) = \pi(p \triangleleft \exp(tA))$$

for any t. Let $f \in C^\infty(M)$ be arbitrary. Then

$$\begin{aligned}
(\pi_*)_p X^A_p(f) &= X^A_p(f \circ \pi) \\
&= [(f \circ \pi)(p \triangleleft \exp(tA))]'(0) \\
&= [f(\pi(p))]'(0) \\
&= 0,
\end{aligned}$$

since $f(\pi(p))$ is constant. Hence $X^A_p \in V_pP$. Alternatively, one can also argue that $(\pi_*)_p X^A_p$
is the tangent vector to a constant curve on M.


In particular, the map $i_p\colon T_eG \to V_pP$ is now a bijection. The idea of a connection
is to make a choice of how to “connect” the individual points in “neighbouring” fibres in a
principal fibre bundle.


Definition. Let $(P,\pi,M)$ be a principal bundle and let $p \in P$. A horizontal subspace at p
is a vector subspace $H_pP$ of $T_pP$ which is complementary to $V_pP$, i.e.

$$T_pP = H_pP \oplus V_pP.$$


The choice of horizontal space at $p \in P$ is not unique. However, once a choice is made,
there is a unique decomposition of each $X_p \in T_pP$ as

$$X_p = \mathrm{hor}(X_p) + \mathrm{ver}(X_p),$$

with $\mathrm{hor}(X_p) \in H_pP$ and $\mathrm{ver}(X_p) \in V_pP$.


Definition. A connection on a principal G-bundle $(P,\pi,M)$ is a choice of horizontal space
at each $p \in P$ such that

i) For all $g \in G$, $p \in P$ and $X_p \in H_pP$, we have

$$(\triangleleft g)_* X_p \in H_{p \triangleleft g}P,$$

where $(\triangleleft g)_*$ is the push-forward of the map $(- \triangleleft g)\colon P \to P$, which is a bijection.
We can also write this condition more concisely as

$$(\triangleleft g)_*(H_pP) = H_{p \triangleleft g}P.$$

ii) For every smooth $X \in \Gamma(TP)$, the two summands in the unique decomposition

$$X|_p = \mathrm{hor}(X|_p) + \mathrm{ver}(X|_p)$$

at each $p \in P$ extend to smooth $\mathrm{hor}(X), \mathrm{ver}(X) \in \Gamma(TP)$.


The definition formalises the idea that the assignment of an $H_pP$ to each $p \in P$ should
be “smooth” within each fibre (i) as well as between different fibres (ii).


Remark 21.2. For each $X_p \in T_pP$, both $\mathrm{hor}(X_p)$ and $\mathrm{ver}(X_p)$ depend on the choice of $H_pP$.


21.2 Connection one-forms


Technically, the choice of a horizontal subspace $H_pP$ at each $p \in P$ providing a connection
is conveniently encoded in the thus induced Lie-algebra-valued one-form

$$\omega_p\colon T_pP \to T_eG, \qquad X_p \mapsto \omega_p(X_p) := i_p^{-1}(\mathrm{ver}(X_p)).$$


Definition. The map $\omega\colon p \mapsto \omega_p$ sending each $p \in P$ to the $T_eG$-valued one-form $\omega_p$ is
called the connection one-form with respect to the connection.


Remark 21.3. We have seen how to produce a one-form from a choice of horizontal spaces
(i.e. a connection). The choice of horizontal spaces can be recovered from $\omega$ by

$$H_pP = \ker(\omega_p).$$


Of course, not every (Lie-algebra-valued) one-form $\omega$ on P is such that $\ker(\omega_p)$ gives a
connection on the principal bundle. What we would now like to do is to study some crucial
properties of $\omega$. We will then elevate these properties to a definition of connection one-form
absent a connection, so that we may re-define the notion of connection in terms of a
connection one-form.


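Lemma 21.4. For all $p \in P$, $g \in G$ and $A \in T_eG$, we have

$$(\triangleleft g)_* X^A_p = X^{(\mathrm{Ad}_{g^{-1}})_* A}_{p \triangleleft g}.$$


Proof. Let $f \in C^\infty(P)$ be arbitrary. We have

$$\begin{aligned}
(\triangleleft g)_* X^A_p(f) &= X^A_p(f \circ (- \triangleleft g)) \\
&= [f(p \triangleleft \exp(tA) \triangleleft g)]'(0) \\
&= [f(p \triangleleft g \triangleleft g^{-1} \triangleleft \exp(tA) \triangleleft g)]'(0) \\
&= [f(p \triangleleft g \triangleleft (g^{-1} \exp(tA)\, g))]'(0) \\
&= [f(p \triangleleft g \triangleleft \mathrm{Ad}_{g^{-1}}(\exp(tA)))]'(0) \\
&= [f(p \triangleleft g \triangleleft \exp(t (\mathrm{Ad}_{g^{-1}})_* A))]'(0) \\
&= X^{(\mathrm{Ad}_{g^{-1}})_* A}_{p \triangleleft g}(f),
\end{aligned}$$

which is what we wanted.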
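
For a matrix Lie group, where $(\mathrm{Ad}_{g^{-1}})_* A = g^{-1} A g$, the key step $\mathrm{Ad}_{g^{-1}}(\exp(tA)) = \exp(t(\mathrm{Ad}_{g^{-1}})_* A)$ of the proof can be checked numerically. The Python sketch below is only an illustration under that assumption; the helper names, the sample matrices and the truncated power series used for the exponential are choices made here, not part of the notes.

```python
# Minimal sketch: check g^{-1} exp(tA) g = exp(t g^{-1} A g) for 2x2 matrices.
# Helper names are hypothetical; the exponential is a truncated power series.

def mat_mul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def scale(a, c):
    return [[c * x for x in row] for row in a]

def mat_exp(a, terms=40):
    """Truncated series sum_{k < terms} a^k / k!."""
    n = len(a)
    result = [[float(i == j) for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for k in range(1, terms):
        term = scale(mat_mul(term, a), 1.0 / k)
        result = [[result[i][j] + term[i][j] for j in range(n)] for i in range(n)]
    return result

def inv2(m):
    """Inverse of an invertible 2x2 matrix."""
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

g = [[2.0, 1.0], [1.0, 1.0]]
A = [[0.0, 1.0], [-1.0, 0.5]]
t = 0.7

# Left: conjugate the group element exp(tA). Right: exponentiate the conjugated generator.
lhs = mat_mul(mat_mul(inv2(g), mat_exp(scale(A, t))), g)
rhs = mat_exp(scale(mat_mul(mat_mul(inv2(g), A), g), t))
max_diff = max(abs(lhs[i][j] - rhs[i][j]) for i in range(2) for j in range(2))
print(max_diff)
```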


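Theorem 21.5. A connection one-form $\omega$ with respect to a connection satisfies

a) For all $p \in P$, we have $\omega_p(X^A_p) = A$, that is, $\omega_p \circ i_p = \mathrm{id}_{T_eG}$.

b) For all $g \in G$, $p \in P$ and $X_p \in T_pP$, we have

$$((\triangleleft g)^*\omega)|_p(X_p) = (\mathrm{Ad}_{g^{-1}})_*(\omega_p(X_p)),$$

that is, $((\triangleleft g)^*\omega)|_p = (\mathrm{Ad}_{g^{-1}})_* \circ \omega_p$ as maps $T_pP \to T_eG$.

c) $\omega$ is a smooth one-form.


Proof. a) Since $X^A_p \in V_pP$, by definition of $\omega$ we have

$$\omega_p(X^A_p) := i_p^{-1}(\mathrm{ver}(X^A_p)) = i_p^{-1}(X^A_p) = A.$$

b) First observe that the left hand side is linear in $X_p$. Consider the two cases

b.1) Suppose that $X_p \in V_pP$. Then $X_p = X^A_p$ for some $A \in T_eG$. Hence

$$\begin{aligned}
((\triangleleft g)^*\omega)|_p(X^A_p) &= \omega_{p \triangleleft g}((\triangleleft g)_* X^A_p) \\
&= \omega_{p \triangleleft g}\bigl(X^{(\mathrm{Ad}_{g^{-1}})_* A}_{p \triangleleft g}\bigr) \\
&= (\mathrm{Ad}_{g^{-1}})_* A \\
&= (\mathrm{Ad}_{g^{-1}})_*(\omega_p(X^A_p)).
\end{aligned}$$

b.2) Suppose now that $X_p \in H_pP = \ker(\omega_p)$. Then

$$((\triangleleft g)^*\omega)|_p(X_p) = \omega_{p \triangleleft g}((\triangleleft g)_* X_p) = 0$$

since $(\triangleleft g)_* X_p \in H_{p \triangleleft g}P = \ker(\omega_{p \triangleleft g})$.

Let $X_p \in T_pP$. We have

$$\begin{aligned}
((\triangleleft g)^*\omega)|_p(X_p) &= ((\triangleleft g)^*\omega)|_p(\mathrm{ver}(X_p) + \mathrm{hor}(X_p)) \\
&= ((\triangleleft g)^*\omega)|_p(\mathrm{ver}(X_p)) + ((\triangleleft g)^*\omega)|_p(\mathrm{hor}(X_p)) \\
&= (\mathrm{Ad}_{g^{-1}})_*(\omega_p(\mathrm{ver}(X_p))) + 0 \\
&= (\mathrm{Ad}_{g^{-1}})_*(\omega_p(\mathrm{ver}(X_p))) + (\mathrm{Ad}_{g^{-1}})_*(\omega_p(\mathrm{hor}(X_p))) \\
&= (\mathrm{Ad}_{g^{-1}})_*(\omega_p(\mathrm{ver}(X_p) + \mathrm{hor}(X_p))) \\
&= (\mathrm{Ad}_{g^{-1}})_*(\omega_p(X_p)).
\end{aligned}$$

c) We have $\omega = i^{-1} \circ \mathrm{ver}$ and both $i^{-1}$ and ver are smooth.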


22 Local representations of a connection on the base manifold: Yang-Mills fields


We have seen how to associate a connection one-form to a connection, i.e. a certain Lie- 
algebra-valued one-form to a smooth choice of horizontal spaces on the principal bundle. 
We will now study how we can express this connection one-form locally on the base manifold 
of the principal bundle. 


22.1 Yang-Mills fields and local representations 


Recall that a connection one-form on a principal bundle $(P,\pi,M)$ is a smooth Lie-algebra-valued
one-form, i.e. a smooth map

$$\omega\colon \Gamma(TP) \to T_eG$$

which “behaves like a one-form”, in the sense that it is R-linear and satisfies the Leibniz
rule, and such that, in addition, for all $A \in T_eG$, $g \in G$ and $X \in \Gamma(TP)$, we have

i) $\omega(X^A) = A$;

ii) $((\triangleleft g)^*\omega)(X) = (\mathrm{Ad}_{g^{-1}})_*(\omega(X))$.

If the pair $(u,f)$ is a principal bundle automorphism of $(P,\pi,M)$, i.e. if $u\colon P \to P$ and
$f\colon M \to M$ satisfy

$$u(p \triangleleft g) = u(p) \triangleleft g, \qquad \pi \circ u = f \circ \pi,$$

so that the corresponding diagram commutes, we should be able to pull a connection one-form
$\omega$ on P back to another connection one-form $u^*\omega$ on P.

Recall that for a one-form $\omega\colon \Gamma(TN) \to C^\infty(N)$, we defined

$$\Phi^*(\omega)\colon \Gamma(TM) \to C^\infty(M), \qquad X \mapsto \omega(\Phi_*(X)) \circ \Phi$$

for any diffeomorphism $\Phi\colon M \to N$. One might be worried about whether this and similar
definitions apply to Lie-algebra-valued one-forms but, in fact, they do. In our case, even
though $\omega$ lands in $T_eG$, its domain is still $\Gamma(TP)$ and if $u\colon P \to P$ is a diffeomorphism of
P, then $u_*X \in \Gamma(TP)$ and so

$$u^*\omega\colon X \mapsto (u^*\omega)(X) := \omega(u_*(X)) \circ u$$

is again a Lie-algebra-valued one-form. Note that we will no longer distinguish notationally
between the push-forward of tangent vectors and of vector fields.

In practice, e.g. for calculational purposes, one may wish to restrict attention to some
open subset $U \subseteq M$. Let $\sigma\colon U \to P$ be a local section of P, i.e. $\pi \circ \sigma = \mathrm{id}_U$.


Definition. Given a connection one-form $\omega$ on P, such a local section $\sigma$ induces

i) a Yang-Mills field $\omega^U\colon \Gamma(TU) \to T_eG$ given by

$$\omega^U := \sigma^*\omega;$$

ii) a local trivialisation of the principal bundle P, i.e. a map

$$h\colon U \times G \to P, \qquad (m,g) \mapsto \sigma(m) \triangleleft g;$$

iii) a local representation of $\omega$ on U by

$$h^*\omega\colon \Gamma(T(U \times G)) \to T_eG.$$

Note that, at each point $(m,g) \in U \times G$, we have

$$T_{(m,g)}(U \times G) \cong_{\text{Lie alg}} T_mU \oplus T_gG.$$


Remark 22.1. Both the Yang-Mills field $\omega^U$ and the local representation $h^*\omega$ encode the
information carried by $\omega$ locally on U. Since $h^*\omega$ involves $U \times G$ while $\omega^U$ doesn’t, one
might guess that $h^*\omega$ gives a more “accurate” picture of $\omega$ on U than the Yang-Mills field.
But in fact, this is not the case. They both contain the same amount of local information
about the connection one-form $\omega$.


22.2 The Maurer-Cartan form


The relation between the Yang-Mills field and the local representation is provided by the
following result.

Theorem 22.2. For all $v \in T_mU$ and $\gamma \in T_gG$, we have

$$(h^*\omega)_{(m,g)}(v,\gamma) = (\mathrm{Ad}_{g^{-1}})_*(\omega^U(v)) + \Xi_g(\gamma),$$

where $\Xi_g$ is the Maurer-Cartan form

$$\Xi_g\colon T_gG \to T_eG, \qquad L^A_g \mapsto A.$$


Remark 22.3. Note that we have represented a generic element of $T_gG$ as $L^A_g$. This is due
to the following. Recall that the left translation map $\ell_g\colon G \to G$ is a diffeomorphism of G.
As such, its push-forward at any point is a linear isomorphism. In particular, we have

$$((\ell_g)_*)_e\colon T_eG \xrightarrow{\ \sim\ } T_gG,$$

that is, the tangent space at any point $g \in G$ can be canonically identified with the tangent
space at the identity. Hence, we can write any element of $T_gG$ as

$$L^A_g := ((\ell_g)_*)_e(A)$$

for some $A \in T_eG$.

Let us consider some specific examples.
Let us consider some specific examples. 


Example 22.4. Any chart $(U,x)$ of a smooth manifold M induces a local section $\sigma\colon U \to LM$
of the frame bundle of M by

$$\sigma(m) := \Bigl(\Bigl(\frac{\partial}{\partial x^1}\Bigr)_m, \ldots, \Bigl(\frac{\partial}{\partial x^{\dim M}}\Bigr)_m\Bigr) \in L_mM.$$

Since GL(dim M, R) can be identified with an open subset of $\mathbb{R}^{(\dim M)^2}$, we have

$$T_e\,\mathrm{GL}(\dim M,\mathbb{R}) \cong_{\text{Lie alg}} \mathbb{R}^{(\dim M)^2},$$

where $\mathbb{R}^{(\dim M)^2}$ is understood as the algebra of dim M × dim M square matrices, with
bracket induced by matrix multiplication. In fact, this holds for any open subset of a
vector space, when considered as a smooth manifold. A connection one-form

$$\omega\colon \Gamma(T\,LM) \to T_e\,\mathrm{GL}(\dim M,\mathbb{R})$$

can thus be given in terms of $(\dim M)^2$ functions

$$\omega^i{}_j\colon \Gamma(T\,LM) \to \mathbb{R}, \qquad 1 \le i,j \le \dim M.$$

The associated Yang-Mills field $\omega^U := \sigma^*\omega$ is, at each point $m \in U$, a Lie-algebra-valued
one-form on the vector space $T_mU$. By using the co-ordinate induced basis and its dual
basis, we can express $(\omega^U)_m$ in terms of components as

$$(\omega^U)_m = \omega^U_\mu(m)\,(dx^\mu)_m,$$

where $1 \le \mu \le \dim M$ and

$$\omega^U_\mu(m) := (\omega^U)_m\Bigl(\Bigl(\frac{\partial}{\partial x^\mu}\Bigr)_m\Bigr).$$

Since $(\omega^U)_m\colon T_mU \to T_eG$, we have $\omega^U_\mu\colon U \to T_eG$. Hence, by employing the same isomorphism
as above, we can identify each $\omega^U_\mu(m)$ with a square dim M × dim M matrix and
define the symbol

$$\Gamma^i{}_{j\mu}(m) := (\omega^U(m))^i{}_{j\mu} := (\omega^U_\mu(m))^i{}_j,$$

usually referred to as the Christoffel symbol. The middle term is just an alternative notation
for the right-most side. Note that, even though all three indices $i, j, \mu$ run from 1 to dim M,
the numbers $\Gamma^i{}_{j\mu}(m)$ do not constitute the components of a (1, 2)-tensor on U. Only the $\mu$
index transforms as a one-form component index, i.e.

$$((g \triangleright \omega^U(m))^i{}_j)_\mu = (g^{-1})^\nu{}_\mu\,(\omega^U_\nu(m))^i{}_j$$

for $g \in \mathrm{GL}(\dim M,\mathbb{R})$, while the $i, j$ indices simply label different one-forms, $(\dim M)^2$ in
total.


Note that the Maurer-Cartan form appearing in Theorem 22.2 only depends on the Lie 
group (and its Lie algebra), not on the principal bundle P or the restriction U C M. In the 
following example, we will go through the explicit calculation of the Maurer-Cartan form 
of the Lie group GL(d,R). 

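Example 22.5. Let $(\mathrm{GL}^+(d,\mathbb{R}), x)$ be a chart on GL(d,R), where $\mathrm{GL}^+(d,\mathbb{R})$ denotes an
open subset of GL(d,R) containing the identity $\mathrm{id}_{\mathbb{R}^d}$, and let $x^i{}_j\colon \mathrm{GL}^+(d,\mathbb{R}) \to \mathbb{R}$ denote
the corresponding co-ordinate functions

$$x^i{}_j := \mathrm{proj}^i{}_j \circ x, \qquad x\colon \mathrm{GL}^+(d,\mathbb{R}) \to x(\mathrm{GL}^+(d,\mathbb{R})) \subseteq \mathbb{R}^{d^2},$$

so that $x^i{}_j(g) := g^i{}_j$. Recall that the co-ordinate functions are smooth maps on the chart
domain, i.e. we have $x^i{}_j \in C^\infty(\mathrm{GL}^+(d,\mathbb{R}))$. Also recall that to each $A \in T_{\mathrm{id}_{\mathbb{R}^d}}\mathrm{GL}(d,\mathbb{R})$
there is associated a left-invariant vector field

$$L^A\colon C^\infty(\mathrm{GL}^+(d,\mathbb{R})) \to C^\infty(\mathrm{GL}^+(d,\mathbb{R}))$$

which, at each point $g \in \mathrm{GL}(d,\mathbb{R})$, is the tangent vector to the curve

$$\gamma^A(t) := g \bullet \exp(tA).$$

Consider the action of $L^A$ on the co-ordinate functions:

$$(L^A x^i{}_j)(g) = [x^i{}_j(g \bullet \exp(tA))]'(0) = x^i{}_j(g \bullet A) = g^i{}_k A^k{}_j,$$

where we have used the fact that for a matrix Lie group, the exponential map is just the
ordinary exponential

$$\exp(A) = \sum_{n=0}^{\infty} \frac{1}{n!} A^n.$$

Hence, we can write

$$L^A|_g = g^i{}_k A^k{}_j \Bigl(\frac{\partial}{\partial x^i{}_j}\Bigr)_g,$$

from which we can read off the Maurer-Cartan form of GL(d,R)

$$(\Xi_g)^i{}_j := (g^{-1})^i{}_k\,(dx^k{}_j)_g.$$

Indeed, we can quickly check that

$$\begin{aligned}
(\Xi_g)^i{}_j(L^A|_g) &= (g^{-1})^i{}_k\,(dx^k{}_j)_g\Bigl(g^p{}_r A^r{}_q \Bigl(\frac{\partial}{\partial x^p{}_q}\Bigr)_g\Bigr) \\
&= (g^{-1})^i{}_k\, g^p{}_r A^r{}_q\, \delta^k_p \delta^q_j \\
&= (g^{-1})^i{}_k\, g^k{}_r A^r{}_j \\
&= \delta^i_r A^r{}_j \\
&= A^i{}_j.
\end{aligned}$$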

22.3 The gauge map 


In physics, we are often prompted to write down a Yang-Mills field because we have local 
information about a connection. We can then try to reconstruct the global connection by 
glueing the Yang-Mills fields on several open subsets of our manifold. 


Suppose, for instance, that we have two open subsets $U_1, U_2 \subseteq M$ and consider the
Yang-Mills fields associated to two local sections $\sigma_1, \sigma_2$. If $\omega^{U_1}$ and $\omega^{U_2}$ are both local versions
of a unique connection one-form, then, if $U_1 \cap U_2 \neq \emptyset$, the Yang-Mills fields $\omega^{U_1}$ and $\omega^{U_2}$
should satisfy some compatibility condition on $U_1 \cap U_2$.


Definition. Within the above set-up, the gauge map is the map

$$\Omega\colon U_1 \cap U_2 \to G$$

where, for each $m \in U_1 \cap U_2$, the Lie group element $\Omega(m) \in G$ satisfies

$$\sigma_2(m) = \sigma_1(m) \triangleleft \Omega(m).$$

Note that since the G-action $\triangleleft$ on P is free and transitive on each fibre, for each m there
exists a unique $\Omega(m)$ satisfying the above condition, and hence the gauge map $\Omega$ is well-defined.
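
For the frame bundle, the gauge map at a single point can be computed by plain linear algebra. The Python sketch below is an illustration under an explicit assumption: a frame is stored as a matrix whose columns are the components of the basis vectors, so that the right action of a matrix g on a frame E reads E ◁ g = E g. The sample frames and the function name `gauge_element` are hypothetical.

```python
# Minimal sketch: the unique group element relating two frames at one point m.
# Assumption: frames are matrices with basis vectors as columns, and E <| g = E g.

def mat_mul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def inv2(m):
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

def gauge_element(frame1, frame2):
    """Solve frame2 = frame1 <| omega for omega, i.e. omega = frame1^{-1} frame2."""
    return mat_mul(inv2(frame1), frame2)

sigma1_m = [[1.0, 1.0], [0.0, 2.0]]   # value of the first section at m
sigma2_m = [[2.0, 1.0], [1.0, 3.0]]   # value of the second section at m

omega_m = gauge_element(sigma1_m, sigma2_m)
reconstructed = mat_mul(sigma1_m, omega_m)
print(omega_m)
print(reconstructed)
```

Uniqueness corresponds to the fact that a frame matrix is invertible, so the linear equation has exactly one solution.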


Theorem 22.6. Under the above assumptions, we have

$$(\omega^{U_2})_m = (\mathrm{Ad}_{\Omega^{-1}(m)})_*(\omega^{U_1})_m + (\Omega^*\Xi_g)_m.$$


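Example 22.7. Consider again the frame bundle LM of some manifold M. Let us evaluate
explicitly the pull-back along $\Omega$ of the Maurer-Cartan form. Since $\Xi_g\colon T_gG \to T_eG$ and
$\Omega\colon U_1 \cap U_2 \to G$, we have $\Omega^*\Xi_g\colon \Gamma(T(U_1 \cap U_2)) \to T_eG$. Let x be a chart map near the point
$m \in U_1 \cap U_2$. We have

$$\begin{aligned}
((\Omega^*\Xi_g)_m)^i{}_j\Bigl(\Bigl(\frac{\partial}{\partial x^\mu}\Bigr)_m\Bigr) &= (\Xi_{\Omega(m)})^i{}_j\Bigl(\Omega_*\Bigl(\frac{\partial}{\partial x^\mu}\Bigr)_m\Bigr) \\
&= (\Omega(m)^{-1})^i{}_k\,(dx^k{}_j)_{\Omega(m)}\Bigl(\Omega_*\Bigl(\frac{\partial}{\partial x^\mu}\Bigr)_m\Bigr) \\
&= (\Omega(m)^{-1})^i{}_k\,\Bigl(\Omega_*\Bigl(\frac{\partial}{\partial x^\mu}\Bigr)_m\Bigr)(x^k{}_j) \\
&= (\Omega(m)^{-1})^i{}_k\,\Bigl(\frac{\partial}{\partial x^\mu}\Bigr)_m(x^k{}_j \circ \Omega) \\
&= (\Omega(m)^{-1})^i{}_k\,\Bigl(\frac{\partial}{\partial x^\mu}\Bigr)_m(\Omega^k{}_j),
\end{aligned}$$

hence, we can write

$$((\Omega^*\Xi_g)_m)^i{}_j = (\Omega(m)^{-1})^i{}_k\,\Bigl(\frac{\partial}{\partial x^\mu}\Bigr)_m(\Omega^k{}_j)\,(dx^\mu)_m =: (\Omega^{-1}\,d\Omega)^i{}_j.$$

Let us now compute the other summand. Recall that $\mathrm{Ad}_g$ is the map

$$\mathrm{Ad}_g\colon G \to G, \qquad h \mapsto g \bullet h \bullet g^{-1}$$

and since $\mathrm{Ad}_g(e) = e$, the push-forward $((\mathrm{Ad}_g)_*)_e\colon T_eG \to T_eG$ is a linear endomorphism
of $T_eG$. Moreover, since here G = GL(d,R) is a matrix Lie group, we have

$$((\mathrm{Ad}_g)_* A)^i{}_j = g^i{}_k\,A^k{}_l\,(g^{-1})^l{}_j =: (g A g^{-1})^i{}_j.$$

Hence, we have

$$((\mathrm{Ad}_{\Omega^{-1}(m)})_*(\omega^{U_1})_m)^i{}_j = (\Omega(m)^{-1})^i{}_k\,((\omega^{U_1})_m)^k{}_l\,(\Omega(m))^l{}_j.$$

Altogether, we find that the transition rule for the Yang-Mills fields on the intersection of
$U_1$ and $U_2$ is given by

$$(\omega^{U_2})^i{}_{j\mu} = (\Omega^{-1})^i{}_k\,(\omega^{U_1})^k{}_{l\mu}\,\Omega^l{}_j + (\Omega^{-1})^i{}_k\,\partial_\mu \Omega^k{}_j.$$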


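As an application, consider the special case in which the sections $\sigma_1, \sigma_2$ are induced by
co-ordinate charts $(U_1,x)$ and $(U_2,y)$. Then we have

$$\Omega^i{}_j = \frac{\partial y^i}{\partial x^j} := \partial_j(y^i \circ x^{-1}) \circ x, \qquad (\Omega^{-1})^i{}_j = \frac{\partial x^i}{\partial y^j} := \partial_j(x^i \circ y^{-1}) \circ y$$

and hence

$$(\omega^{U_2})^i{}_{j\nu} = \frac{\partial y^\mu}{\partial x^\nu}\Bigl(\frac{\partial x^i}{\partial y^k}\,(\omega^{U_1})^k{}_{l\mu}\,\frac{\partial y^l}{\partial x^j} + \frac{\partial x^i}{\partial y^k}\,\frac{\partial^2 y^k}{\partial x^\mu \partial x^j}\Bigr).$$

You may recognise this as the transformation law for the Christoffel symbols from general
relativity.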


23 Parallel transport


We now come to the second term in the sequence “connection, parallel transport, covariant 
derivative”. The idea of parallel transport on a principal bundle hinges on that of horizontal 
lift of a curve on the base manifold, which is a lifting to a curve on the principal bundle in 
the sense that the projection to the base manifold of this curve gives the curve we started 
with. In particular, if the principal bundle is equipped with a connection, we would like to 
impose some extra conditions on this lifting, so that it “connects” nearby fibres in a nice 
way. We will then consider the same idea on an associated bundle and see how we can
induce a derivative operator if the associated bundle is a vector bundle.


23.1 Horizontal lifts to the principal bundle


Definition. Let $(P,\pi,M)$ be a principal G-bundle equipped with a connection and let
$\gamma\colon [0,1] \to M$ be a curve on M. The horizontal lift of $\gamma$ through $p_0 \in P$ is the unique curve

$$\gamma^\uparrow\colon [0,1] \to P$$

with $\gamma^\uparrow(0) = p_0 \in \mathrm{preim}_\pi(\{\gamma(0)\})$ satisfying

i) $\pi \circ \gamma^\uparrow = \gamma$;

ii) $\forall \lambda \in [0,1]$: $\mathrm{ver}(X_{\gamma^\uparrow,\gamma^\uparrow(\lambda)}) = 0$;

iii) $\forall \lambda \in [0,1]$: $\pi_*(X_{\gamma^\uparrow,\gamma^\uparrow(\lambda)}) = X_{\gamma,\gamma(\lambda)}$.


Intuitively, a horizontal lift of a curve $\gamma$ on M is a curve $\gamma^\uparrow$ on P such that each point
$\gamma^\uparrow(\lambda) \in P$ belongs to the fibre of $\gamma(\lambda)$ (condition i), the tangent vectors to the curve $\gamma^\uparrow$
have no vertical component (condition ii), i.e. they lie entirely in the horizontal spaces at
each point, and finally the projection of the tangent vector to $\gamma^\uparrow$ at $\gamma^\uparrow(\lambda)$ coincides with
the tangent vector to the curve $\gamma$ at $\pi(\gamma^\uparrow(\lambda)) = \gamma(\lambda)$ (condition iii).


Remark 23.1. Note that the uniqueness in the above definition only stems from the choice of
$p_0 \in \mathrm{preim}_\pi(\{\gamma(0)\})$. A curve on M has several horizontal lifts to a curve on P, but there
is only one such curve going through each point $p_0 \in \mathrm{preim}_\pi(\{\gamma(0)\})$. Clearly, different
horizontal lifts cannot intersect each other.


Our strategy to write down an explicit expression for the horizontal lift through $p_0 \in P$
of a curve $\gamma\colon [0,1] \to M$ is to proceed in two steps:

i) “Generate” the horizontal lift by starting from some arbitrary curve $\delta\colon [0,1] \to P$ such
that $\pi \circ \delta = \gamma$ by action of a suitable curve $g\colon [0,1] \to G$ so that

$$\gamma^\uparrow(\lambda) = \delta(\lambda) \triangleleft g(\lambda).$$

The suitable curve g will be the solution to an ordinary differential equation with
initial condition $g(0) = g_0$, where $g_0$ is the unique element in G such that

$$\delta(0) \triangleleft g_0 = p_0 \in P.$$

ii) We will explicitly solve (locally) this differential equation for $g\colon [0,1] \to G$ by a
path-ordered integral over the local Yang-Mills field.


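We have the following result characterising the curve g appearing above.


Theorem 23.2. The (first order) ODE satisfied by the curve $g\colon [0,1] \to G$ is

$$(\mathrm{Ad}_{g(\lambda)^{-1}})_*(\omega_{\delta(\lambda)}(X_{\delta,\delta(\lambda)})) + \Xi_{g(\lambda)}(X_{g,g(\lambda)}) = 0$$

with the initial condition $g(0) = g_0$.

Corollary 23.3. If G is a matrix group, then the above ODE takes the form

$$g(\lambda)^{-1}\,\omega_{\delta(\lambda)}(X_{\delta,\delta(\lambda)})\,g(\lambda) + g(\lambda)^{-1}\,\dot g(\lambda) = 0,$$

where we denoted matrix multiplication by juxtaposition and $\dot g(\lambda)$ denotes the derivative
with respect to $\lambda$ of the matrix entries of g. Equivalently, by multiplying both sides on the
left by $g(\lambda)$,

$$\dot g(\lambda) = -\omega_{\delta(\lambda)}(X_{\delta,\delta(\lambda)})\,g(\lambda).$$


In order to further massage this ODE, let us consider a chart $(U,x)$ on the base manifold
M, such that the image of $\gamma$ is entirely contained in U. A local section $\sigma\colon U \to P$ induces

i) a Yang-Mills field $\omega^U$;

ii) a curve on P by $\delta := \sigma \circ \gamma$.

In fact, since the only condition imposed on $\delta$ is that $\pi \circ \delta = \gamma$, choosing such a curve $\delta$
is equivalent to choosing a local section $\sigma$. Note that we have

$$\sigma_*(X_{\gamma,\gamma(\lambda)}) = X_{\delta,\delta(\lambda)}$$

and hence

$$\omega_{\delta(\lambda)}(X_{\delta,\delta(\lambda)}) = \omega_{\delta(\lambda)}(\sigma_*(X_{\gamma,\gamma(\lambda)})) = (\sigma^*\omega)_{\gamma(\lambda)}(X_{\gamma,\gamma(\lambda)}) = \omega^U_\mu(\gamma(\lambda))\,\dot\gamma^\mu(\lambda),$$

so that the ODE becomes

$$\dot g(\lambda) = -\Gamma_\mu(\gamma(\lambda))\,\dot\gamma^\mu(\lambda)\,g(\lambda),$$

where $\Gamma_\mu := \omega^U_\mu$ and $\dot\gamma^\mu(\lambda) := (x^\mu \circ \gamma)'(\lambda)$, together with the initial condition $g(0) = g_0$.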
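
Before solving this ODE analytically, it can be integrated numerically. The Python sketch below is an illustration under stated assumptions: the matrix-valued function `coeff(lam)` stands for the combination $\Gamma_\mu(\gamma(\lambda))\,\dot\gamma^\mu(\lambda)$ along the curve, the integrator is the classical fourth-order Runge-Kutta scheme (a choice made here, not something prescribed by the notes), and all names and sample data are hypothetical. For a constant coefficient matrix A the result is compared with the closed form $g(1) = \exp(-A)\,g_0$.

```python
# Minimal sketch: integrate dg/dlam = -coeff(lam) g(lam) on [0, 1] with RK4.
# coeff(lam) plays the role of Gamma_mu(gamma(lam)) * gammadot^mu(lam).

def mat_mul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def scale(a, c):
    return [[c * x for x in row] for row in a]

def mat_exp(a, terms=40):
    n = len(a)
    result = [[float(i == j) for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for k in range(1, terms):
        term = scale(mat_mul(term, a), 1.0 / k)
        result = add(result, term)
    return result

def solve_lift_ode(coeff, g0, steps=200):
    """Return g(1) for dg/dlam = -coeff(lam) g with g(0) = g0."""
    h = 1.0 / steps
    g = [row[:] for row in g0]

    def rhs(lam, y):
        return scale(mat_mul(coeff(lam), y), -1.0)

    for n in range(steps):
        lam = n * h
        k1 = rhs(lam, g)
        k2 = rhs(lam + h / 2, add(g, scale(k1, h / 2)))
        k3 = rhs(lam + h / 2, add(g, scale(k2, h / 2)))
        k4 = rhs(lam + h, add(g, scale(k3, h)))
        incr = add(add(k1, scale(k2, 2.0)), add(scale(k3, 2.0), k4))
        g = add(g, scale(incr, h / 6))
    return g

A = [[0.0, 1.0], [-1.0, 0.0]]    # constant coefficient matrix (sample data)
g0 = [[1.0, 0.0], [0.0, 1.0]]

numeric = solve_lift_ode(lambda lam: A, g0)
exact = mat_mul(mat_exp(scale(A, -1.0)), g0)
max_err = max(abs(numeric[i][j] - exact[i][j]) for i in range(2) for j in range(2))
print(max_err)
```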




23.2 Solution of the horizontal lift ODE by a path-ordered exponential 


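As a first step towards the solution of our ODE, consider

$$g(t) = g_0 - \int_0^t d\lambda\, \Gamma_\mu(\gamma(\lambda))\,\dot\gamma^\mu(\lambda)\,g(\lambda).$$

This doesn’t seem to have brought us far since the function g that we would like to determine
appears again on the right hand side. However, we can now iterate this definition to obtain

$$\begin{aligned}
g(t) &= g_0 - \int_0^t d\lambda_1\, \Gamma_\mu(\gamma(\lambda_1))\,\dot\gamma^\mu(\lambda_1)\Bigl(g_0 - \int_0^{\lambda_1} d\lambda_2\, \Gamma_\nu(\gamma(\lambda_2))\,\dot\gamma^\nu(\lambda_2)\,g(\lambda_2)\Bigr) \\
&= g_0 - \int_0^t d\lambda_1\, \Gamma_\mu(\gamma(\lambda_1))\,\dot\gamma^\mu(\lambda_1)\,g_0 + \int_0^t d\lambda_1 \int_0^{\lambda_1} d\lambda_2\, \Gamma_\mu(\gamma(\lambda_1))\,\dot\gamma^\mu(\lambda_1)\,\Gamma_\nu(\gamma(\lambda_2))\,\dot\gamma^\nu(\lambda_2)\,g(\lambda_2).
\end{aligned}$$

Matters seem to only get worse, until one realises that the first integral no longer contains
the unknown function g. Hence, the above expression provides a “first-order” approximation
to g. It is clear that we can get higher-order approximations by iterating this process

$$\begin{aligned}
g(t) = g_0 &- \int_0^t d\lambda_1\, \Gamma_\mu(\gamma(\lambda_1))\,\dot\gamma^\mu(\lambda_1)\,g_0 \\
&+ \int_0^t d\lambda_1 \int_0^{\lambda_1} d\lambda_2\, \Gamma_\mu(\gamma(\lambda_1))\,\dot\gamma^\mu(\lambda_1)\,\Gamma_\nu(\gamma(\lambda_2))\,\dot\gamma^\nu(\lambda_2)\,g_0 \\
&+ \cdots \\
&+ (-1)^n \int_0^t d\lambda_1 \int_0^{\lambda_1} d\lambda_2 \cdots \int_0^{\lambda_{n-1}} d\lambda_n\, \Gamma_\mu(\gamma(\lambda_1))\,\dot\gamma^\mu(\lambda_1) \cdots \Gamma_\nu(\gamma(\lambda_n))\,\dot\gamma^\nu(\lambda_n)\,g(\lambda_n).
\end{aligned}$$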


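Note how the range of each integral depends on the integration variable of the previous
integral. It would be much nicer if we could have the same range in each integral. In fact,
there is a standard trick to achieve this. The region of integration in the double integral is
the triangle $0 \le \lambda_2 \le \lambda_1 \le t$ in the $(\lambda_1,\lambda_2)$-plane,
and if the integrand $f(\lambda_1,\lambda_2)$ is invariant under the exchange $\lambda_1 \leftrightarrow \lambda_2$, we have

$$\int_0^t d\lambda_1 \int_0^{\lambda_1} d\lambda_2\, f(\lambda_1,\lambda_2) = \frac{1}{2} \int_0^t d\lambda_1 \int_0^t d\lambda_2\, f(\lambda_1,\lambda_2).$$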
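
The role of the symmetry assumption can be seen in a small numerical experiment. The Python sketch below (function names, sample integrands and the midpoint grid are assumptions made for illustration) compares the integral over the triangle with half of the integral over the full square, once for a symmetric integrand and once for a non-symmetric one.

```python
# Minimal sketch: triangle integral versus half of the square integral.
# Midpoint rule on an n x n grid; cells on the diagonal are split evenly.

def ordered_and_full(f, t=1.0, n=50):
    """Return (integral over 0 <= lam2 <= lam1 <= t, integral over the full square)."""
    h = t / n
    pts = [(i + 0.5) * h for i in range(n)]
    full = sum(f(a, b) for a in pts for b in pts) * h * h
    ordered = 0.0
    for i, a in enumerate(pts):
        for j, b in enumerate(pts):
            if j < i:
                ordered += f(a, b) * h * h
            elif j == i:
                ordered += 0.5 * f(a, b) * h * h   # half of each diagonal cell lies in the triangle
    return ordered, full

def symmetric(a, b):
    return a * b + (a + b) ** 2      # invariant under a <-> b

def asymmetric(a, b):
    return a                         # depends on the first argument only

ord_sym, full_sym = ordered_and_full(symmetric)
ord_asym, full_asym = ordered_and_full(asymmetric)
print(ord_sym, full_sym / 2)     # these agree
print(ord_asym, full_asym / 2)   # these do not (about 1/3 versus 1/4)
```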


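Generalising to n dimensions, we have

$$\int_0^t d\lambda_1 \cdots \int_0^{\lambda_{n-1}} d\lambda_n\, f(\lambda_1,\ldots,\lambda_n) = \frac{1}{n!} \int_0^t d\lambda_1 \cdots \int_0^t d\lambda_n\, f(\lambda_1,\ldots,\lambda_n)$$

if f is invariant under any permutation of its arguments. Moreover, since each term in our
integrands only depends on one integration variable at a time, we can use

$$\int_0^t d\lambda_1 \cdots \int_0^t d\lambda_n\, f_1(\lambda_1) \cdots f_n(\lambda_n) = \Bigl(\int_0^t d\lambda_1\, f_1(\lambda_1)\Bigr) \cdots \Bigl(\int_0^t d\lambda_n\, f_n(\lambda_n)\Bigr)$$

so that, in our case, we would have

$$g(t) = \Bigl(\sum_{n=0}^{\infty} \frac{(-1)^n}{n!} \Bigl(\int_0^t d\lambda\, \Gamma_\mu(\gamma(\lambda))\,\dot\gamma^\mu(\lambda)\Bigr)^n\Bigr) g_0 = \exp\Bigl(-\int_0^t d\lambda\, \Gamma_\mu(\gamma(\lambda))\,\dot\gamma^\mu(\lambda)\Bigr) g_0.$$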

However, our integrands are Lie-algebra-valued (that is, matrix valued), and since the
factors therein need not commute, they are not invariant under permutations of the
independent variables. Hence, the above formula doesn’t work. Instead, we write

$$g(t) = \mathrm{P}\exp\Bigl(-\int_0^t d\lambda\, \Gamma_\mu(\gamma(\lambda))\,\dot\gamma^\mu(\lambda)\Bigr) g_0,$$

where the path-ordered exponential P exp is defined to yield the correct expression for g(t).
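
One common way to approximate a path-ordered exponential numerically is as an ordered product of short-step exponentials, with later parameter values multiplied on the left. The Python sketch below uses that product formula, which is a technique chosen here for illustration rather than the definition given in the notes; `coeff(lam)` again stands for $\Gamma_\mu(\gamma(\lambda))\,\dot\gamma^\mu(\lambda)$, and the piecewise-constant sample data with non-commuting matrices A and B are hypothetical.

```python
# Minimal sketch: P exp(-int_0^1 coeff) as an ordered product of step exponentials.

def mat_mul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def scale(a, c):
    return [[c * x for x in row] for row in a]

def mat_exp(a, terms=40):
    n = len(a)
    result = [[float(i == j) for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for k in range(1, terms):
        term = scale(mat_mul(term, a), 1.0 / k)
        result = add(result, term)
    return result

def path_ordered_exp(coeff, t=1.0, steps=2000):
    """Ordered product of exp(-h * coeff(lam_n)); later steps act from the left."""
    h = t / steps
    result = [[1.0, 0.0], [0.0, 1.0]]
    for n in range(steps):
        lam = (n + 0.5) * h
        result = mat_mul(mat_exp(scale(coeff(lam), -h), terms=12), result)
    return result

A = [[0.0, 1.0], [0.0, 0.0]]
B = [[0.0, 0.0], [1.0, 0.0]]     # A and B do not commute

def coeff(lam):
    return A if lam < 0.5 else B  # A on the first half of the curve, B on the second

ordered = path_ordered_exp(coeff)
expected = mat_mul(mat_exp(scale(B, -0.5)), mat_exp(scale(A, -0.5)))
naive = mat_exp(scale(add(A, B), -0.5))   # ordinary exponential of the integral

diff_expected = max(abs(ordered[i][j] - expected[i][j]) for i in range(2) for j in range(2))
diff_naive = max(abs(ordered[i][j] - naive[i][j]) for i in range(2) for j in range(2))
print(diff_expected, diff_naive)
```

For this piecewise-constant example the ordered product reproduces $\exp(-B/2)\exp(-A/2)$, whereas the ordinary exponential of the integral gives a visibly different matrix.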


Summarising, we have the following. 


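Proposition 23.4. For a principal G-bundle $(P,\pi,M)$, where G is a matrix Lie group, the
horizontal lift of a curve $\gamma\colon [0,1] \to U$ through $p_0 \in \mathrm{preim}_\pi(\{\gamma(0)\})$, where $(U,x)$ is a chart
on M, is given in terms of a local section $\sigma\colon U \to P$ by the explicit expression

$$\gamma^\uparrow(\lambda) = (\sigma \circ \gamma)(\lambda) \triangleleft \Bigl(\mathrm{P}\exp\Bigl(-\int_0^\lambda d\tilde\lambda\, \Gamma_\mu(\gamma(\tilde\lambda))\,\dot\gamma^\mu(\tilde\lambda)\Bigr) g_0\Bigr),$$

where $g_0$ is the unique element of G such that $(\sigma \circ \gamma)(0) \triangleleft g_0 = p_0$.


Definition. Let $\gamma^\uparrow_p\colon [0,1] \to P$ be the horizontal lift through $p \in \mathrm{preim}_\pi(\{\gamma(0)\})$ of the
curve $\gamma\colon [0,1] \to M$. The parallel transport map along $\gamma$ is the map

$$T_\gamma\colon \mathrm{preim}_\pi(\{\gamma(0)\}) \to \mathrm{preim}_\pi(\{\gamma(1)\}), \qquad p \mapsto \gamma^\uparrow_p(1).$$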


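Remark 23.5. The parallel transport is, in fact, a bijection between the fibres $\mathrm{preim}_\pi(\{\gamma(0)\})$
and $\mathrm{preim}_\pi(\{\gamma(1)\})$. It is injective since there is a unique horizontal lift of $\gamma$ through each
point $p \in \mathrm{preim}_\pi(\{\gamma(0)\})$, and horizontal lifts through different points do not intersect.
It is surjective since for each $q \in \mathrm{preim}_\pi(\{\gamma(1)\})$ we can find a p such that $q = \gamma^\uparrow_p(1)$ as
follows. Let $\tilde p \in \mathrm{preim}_\pi(\{\gamma(0)\})$. Then $\gamma^\uparrow_{\tilde p}(1)$ belongs to the same fibre as q and hence there
exists a unique $g \in G$ such that $q = \gamma^\uparrow_{\tilde p}(1) \triangleleft g$. Recall that

$$\gamma^\uparrow_{\tilde p}(\lambda) = (\sigma \circ \gamma)(\lambda) \triangleleft (\mathrm{P}\exp(\cdots)\,g_0)$$

where $g_0$ is the unique $g_0 \in G$ such that $\tilde p = (\sigma \circ \gamma)(0) \triangleleft g_0$. Define $p \in \mathrm{preim}_\pi(\{\gamma(0)\})$ by

$$p := \tilde p \triangleleft g = (\sigma \circ \gamma)(0) \triangleleft (g_0 \bullet g).$$

Then we have

$$\begin{aligned}
\gamma^\uparrow_p(1) &= (\sigma \circ \gamma)(1) \triangleleft (\mathrm{P}\exp(\cdots)\,g_0 \bullet g) \\
&= \bigl((\sigma \circ \gamma)(1) \triangleleft (\mathrm{P}\exp(\cdots)\,g_0)\bigr) \triangleleft g \\
&= \gamma^\uparrow_{\tilde p}(1) \triangleleft g \\
&= q.
\end{aligned}$$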


Loops and holonomy groups 


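Consider the case of loops, i.e. curves $\gamma\colon [0,1] \to M$ for which $\gamma(0) = \gamma(1)$. Fix some
$p \in \mathrm{preim}_\pi(\{\gamma(0)\})$. The condition that $\pi \circ \gamma^\uparrow_p = \gamma$ then implies that $\gamma^\uparrow_p(0)$ and $\gamma^\uparrow_p(1)$
belong to the same fibre. Hence, there exists a unique $g_\gamma \in G$ such that

$$\gamma^\uparrow_p(1) = \gamma^\uparrow_p(0) \triangleleft g_\gamma = p \triangleleft g_\gamma.$$

Definition. Let $\omega$ be a connection one-form on the principal G-bundle $(P,\pi,M)$. Let
$\gamma\colon [0,1] \to M$ be a loop with base-point $a \in M$, i.e. $\gamma(0) = \gamma(1) = a$. The subgroup of G

$$\mathrm{Hol}_a(\omega) := \{ g_\gamma \mid \gamma^\uparrow_p(1) = p \triangleleft g_\gamma \text{ for some loop } \gamma \}$$

is called the holonomy group of $\omega$ on P at the base-point a.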
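
To get a feeling for the elements $g_\gamma$, consider a toy example that is not taken from the notes: a chart domain in $\mathbb{R}^2$ with a hypothetical constant Yang-Mills field $\omega^U = A\,dx + B\,dy$ for fixed 2×2 matrices A and B, and a square loop of side s traversed along +x, +y, −x, −y, with $g_0$ the identity. On each edge the coefficient in the ODE for g is constant, so the path-ordered exponential is a product of four ordinary exponentials. The Python sketch below (all names are illustrative) computes this product for a commuting and a non-commuting pair.

```python
# Minimal sketch: group element picked up around a square loop for a constant
# Yang-Mills field A dx + B dy (hypothetical sample data), starting from g0 = identity.

def mat_mul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def scale(a, c):
    return [[c * x for x in row] for row in a]

def mat_exp(a, terms=40):
    n = len(a)
    result = [[float(i == j) for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for k in range(1, terms):
        term = scale(mat_mul(term, a), 1.0 / k)
        result = [[result[i][j] + term[i][j] for j in range(n)] for i in range(n)]
    return result

def holonomy_square(A, B, s):
    """Edges in order: +x, +y, -x, -y. Each edge contributes exp(-(integral of the field))."""
    edge_integrals = [scale(A, s), scale(B, s), scale(A, -s), scale(B, -s)]
    g = [[1.0, 0.0], [0.0, 1.0]]
    for integral in edge_integrals:
        g = mat_mul(mat_exp(scale(integral, -1.0)), g)   # later edges act from the left
    return g

def deviation_from_identity(g):
    return max(abs(g[i][j] - float(i == j)) for i in range(2) for j in range(2))

commuting = holonomy_square([[1.0, 0.0], [0.0, 2.0]], [[3.0, 0.0], [0.0, -1.0]], 0.1)
non_commuting = holonomy_square([[0.0, 1.0], [0.0, 0.0]], [[0.0, 0.0], [1.0, 0.0]], 0.1)
dev_commuting = deviation_from_identity(commuting)
dev_non_commuting = deviation_from_identity(non_commuting)
print(dev_commuting, dev_non_commuting)
```

In the commuting case the loop returns the identity up to rounding, while in the non-commuting case it does not, so the lift of the closed loop fails to close.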


23.3 Horizontal lifts to the associated bundle 


Almost everything that we have done so far transfers with ease to an associated bundle via 
the following definition. 


Definition. Let $(P,\pi,M)$ be a principal G-bundle and $\omega$ a connection one-form on P. Let
$(P_F,\pi_F,M)$ be an associated fibre bundle of P on whose typical fibre F the Lie group G
acts on the left by $\triangleright$. Let $\gamma\colon [0,1] \to M$ be a curve on M and let $\gamma^\uparrow_p$ be its horizontal lift
to P through $p \in \mathrm{preim}_\pi(\{\gamma(0)\})$. Then the horizontal lift of $\gamma$ to the associated bundle
$P_F$ through the point $[p,f] \in P_F$ is the curve

$$\gamma^{\uparrow P_F}_{[p,f]}\colon [0,1] \to P_F, \qquad \lambda \mapsto [\gamma^\uparrow_p(\lambda), f].$$

For instance, we have the obvious parallel transport map.


Definition. The parallel transport map on the associated bundle is given by

$$T^{P_F}_\gamma\colon \mathrm{preim}_{\pi_F}(\{\gamma(0)\}) \to \mathrm{preim}_{\pi_F}(\{\gamma(1)\}), \qquad [p,f] \mapsto \gamma^{\uparrow P_F}_{[p,f]}(1).$$


Remark 23.6. If F is a vector space and $\triangleright\colon G \times F \to F$ is fibre-wise linear, i.e. for each fixed
$g \in G$, the map $(g \triangleright -)\colon F \to F$ is linear, then $(P_F,\pi_F,M)$ is called a vector bundle. The
basic idea of a covariant derivative is as follows. Let $\sigma\colon U \to P_F$ be a local section of the
associated bundle. We would like to define the derivative of $\sigma$ at the point $m \in U \subseteq M$ in
the direction $X \in T_mM$. By definition, there exists a curve $\gamma\colon (-\varepsilon,\varepsilon) \to M$ with $\gamma(0) = m$
such that $X = X_{\gamma,m}$. Then for any $0 < t < \varepsilon$, the points $\gamma^{\uparrow P_F}_{\sigma(m)}(t)$ and $\sigma(\gamma(t))$ lie in
the same fibre of $P_F$. But since the fibres are vector spaces, we can write the differential
quotient

$$\frac{\sigma(\gamma(t)) - \gamma^{\uparrow P_F}_{\sigma(m)}(t)}{t},$$

where the minus sign denotes the additive inverse in the vector space $\mathrm{preim}_{\pi_F}(\{\gamma(t)\})$ and
hence define the derivative of $\sigma$ at the point m in the direction X, or the derivative of $\sigma$
along $\gamma$ at $\gamma(0) = m$, by

$$\lim_{t \to 0} \frac{\sigma(\gamma(t)) - \gamma^{\uparrow P_F}_{\sigma(m)}(t)}{t}$$

(of course, this makes sense as soon as we have a topology on the fibres). We will soon
present a more abstract approach.


24 Curvature and torsion on principal bundles


Usually, in more elementary treatments of differential geometry or general relativity, cur- 
vature and torsion are mentioned together as properties of a covariant derivative over the 
tangent or the frame bundle. Since we will soon define the notion of curvature on a general 
principal bundle equipped with a connection, one might expect that there be a general 
definition of torsion on a principal bundle with a connection. However, this is not the case. 
Torsion requires additional structure beyond that induced by a connection. The reason why 
curvature and torsion are sometimes presented together is that frame bundles are already 
equipped, in a canonical way, with the extra structure required to define torsion. 


24.1 Covariant exterior derivative and curvature 


Definition. Let $(P,\pi,M)$ be a principal G-bundle with connection one-form $\omega$. Let $\phi$ be
a k-form (i.e. an anti-symmetric, $C^\infty(P)$-multilinear map) with values in some module V.
Then the exterior covariant derivative of $\phi$ is

$$D\phi\colon \Gamma(TP)^{\times(k+1)} \to V, \qquad (X_1,\ldots,X_{k+1}) \mapsto d\phi(\mathrm{hor}(X_1),\ldots,\mathrm{hor}(X_{k+1})).$$


Definition. Let $(P,\pi,M)$ be a principal G-bundle with connection one-form $\omega$. The curvature
of the connection one-form $\omega$ is the Lie-algebra-valued 2-form on P

$$\Omega\colon \Gamma(TP) \times \Gamma(TP) \to T_eG$$

defined by

$$\Omega := D\omega.$$

For calculational purposes, we would like to make this definition a bit more explicit. 


Proposition 24.1. Let $\omega$ be a connection one-form and $\Omega$ its curvature. Then

$$\Omega = d\omega + \omega \wedge \omega \qquad (\star)$$

with the second term on the right hand side defined as

$$(\omega \wedge \omega)(X,Y) := [\![\omega(X), \omega(Y)]\!],$$

where $X, Y \in \Gamma(TP)$ and the double bracket denotes the Lie bracket on $T_eG$.


Remark 24.2. If G is a matrix Lie group, and hence $T_eG$ is an algebra of matrices of the
same size as those of G, then we can write

$$\Omega^i{}_j = d\omega^i{}_j + \omega^i{}_k \wedge \omega^k{}_j.$$

Proof. Since $\Omega$ is $C^\infty$-bilinear, it suffices to consider the following three cases.

a) Suppose that $X, Y \in \Gamma(TP)$ are both vertical, that is, there exist $A, B \in T_eG$ such
that $X = X^A$ and $Y = X^B$. Then the left hand side of our equation reads

$$\begin{aligned}
\Omega(X^A, X^B) &:= D\omega(X^A, X^B) \\
&= d\omega(\mathrm{hor}(X^A), \mathrm{hor}(X^B)) \\
&= d\omega(0,0) \\
&= 0
\end{aligned}$$

while the right hand side is

$$\begin{aligned}
d\omega(X^A, X^B) + (\omega \wedge \omega)(X^A, X^B) &= X^A(\omega(X^B)) - X^B(\omega(X^A)) - \omega([X^A, X^B]) + [\![\omega(X^A), \omega(X^B)]\!] \\
&= X^A(B) - X^B(A) - \omega(X^{[\![A,B]\!]}) + [\![A,B]\!] \\
&= -[\![A,B]\!] + [\![A,B]\!] \\
&= 0.
\end{aligned}$$

Note that we have used the fact that the map

$$i\colon T_eG \to \Gamma(TP), \qquad A \mapsto X^A$$

is a Lie algebra homomorphism, and hence

$$X^{[\![A,B]\!]} = i([\![A,B]\!]) = [i(A), i(B)] = [X^A, X^B],$$

where the single square brackets denote the Lie bracket on $\Gamma(TP)$.

b) Suppose that $X, Y \in \Gamma(TP)$ are both horizontal. Then we have

$$\Omega(X,Y) := D\omega(X,Y) = d\omega(\mathrm{hor}(X), \mathrm{hor}(Y)) = d\omega(X,Y)$$

and

$$(\omega \wedge \omega)(X,Y) = [\![\omega(X), \omega(Y)]\!] = [\![0,0]\!] = 0.$$

Hence the equation holds in this case.

c) W.l.o.g. suppose that $X \in \Gamma(TP)$ is horizontal while $Y = X^A \in \Gamma(TP)$ is vertical.
Then the left hand side is

$$\Omega(X, X^A) := D\omega(X, X^A) = d\omega(\mathrm{hor}(X), \mathrm{hor}(X^A)) = d\omega(\mathrm{hor}(X), 0) = 0,$$

while the right hand side gives

$$\begin{aligned}
d\omega(X, X^A) + (\omega \wedge \omega)(X, X^A) &= X(\omega(X^A)) - X^A(\omega(X)) - \omega([X, X^A]) + [\![\omega(X), \omega(X^A)]\!] \\
&= X(A) - X^A(0) - \omega([X, X^A]) + [\![0, A]\!] \\
&= -\omega([X, X^A]) \\
&= 0,
\end{aligned}$$

where the only non-trivial step, which is left as an exercise, is to show that if X is
horizontal and Y is vertical, then $[X,Y]$ is again horizontal.


We would now like to relate the curvature on a principal bundle to (local) objects 
on the base manifold, just like we have done for the connection one-form. Recall that a 
connection one-form on a principal $G$-bundle $(P,\pi,M)$ is a $T_eG$-valued one-form $\omega$ on $P$. 
By using the notation $\Omega^1(P) \otimes T_eG$ for the collection (in fact, bundle) of all $T_eG$-valued 
one-forms, we have $\omega \in \Omega^1(P) \otimes T_eG$. If $\sigma \colon U \to P$ is a local section, with $U \subseteq M$ open, we defined the 
Yang-Mills field $\omega^U \in \Omega^1(U) \otimes T_eG$ by pulling $\omega$ back along $\sigma$. 


Definition. Let $(P,\pi,M)$ be a principal $G$-bundle and let $\Omega$ be the curvature associated to 
a connection one-form on $P$. Let $\sigma \colon U \to P$ be a local section, with $U \subseteq M$ open. Then, the two-form 
\[
\mathrm{Riem} \equiv F := \sigma^*\Omega \in \Omega^2(U) \otimes T_eG
\]
is called the Yang-Mills field strength. 
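Remark 24.3. Observe that the equation $\Omega = d\omega + \omega ⩓ \omega$ on $P$ immediately gives 
\[
\begin{aligned}
\sigma^*\Omega &= \sigma^*(d\omega + \omega ⩓ \omega) \\
&= \sigma^*(d\omega) + \sigma^*(\omega ⩓ \omega) \\
&= d(\sigma^*\omega) + \sigma^*\omega ⩓ \sigma^*\omega.
\end{aligned}
\]
Since Riem is a two-form, we can write 
\[
\mathrm{Riem}_{\mu\nu} = (d\omega^U)_{\mu\nu} + \omega^U_\mu ⩓ \omega^U_\nu.
\]
In the case of a matrix Lie group, by writing $\Gamma^i{}_{j\mu} := (\omega^U)^i{}_{j\mu}$ we can further express this 
in components as 
\[
\mathrm{Riem}^i{}_{j\mu\nu} = \partial_\mu \Gamma^i{}_{j\nu} - \partial_\nu \Gamma^i{}_{j\mu} + \Gamma^i{}_{k\mu}\Gamma^k{}_{j\nu} - \Gamma^i{}_{k\nu}\Gamma^k{}_{j\mu},
\]
from which we immediately observe that Riem is antisymmetric in the last two indices, i.e. 
$\mathrm{Riem}^i{}_{j(\mu\nu)} = 0$. 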
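
As a worked check of the component formula, the following sketch expands $F = d\Gamma + \Gamma \wedge \Gamma$ for a matrix-valued Yang-Mills field written in a chart as $\Gamma = \Gamma_\mu\, dx^\mu$, and then specialises to an abelian structure group. The chart coordinates $x^\mu$, the matrix notation $\Gamma_\mu$ and the symbol $A_\mu$ for the abelian field are notational assumptions made here for illustration.

```latex
\begin{align*}
d\Gamma &= \partial_\mu \Gamma_\nu \, dx^\mu \wedge dx^\nu
         = \tfrac{1}{2}\,(\partial_\mu \Gamma_\nu - \partial_\nu \Gamma_\mu)\, dx^\mu \wedge dx^\nu, \\
\Gamma \wedge \Gamma &= \Gamma_\mu \Gamma_\nu \, dx^\mu \wedge dx^\nu
         = \tfrac{1}{2}\,(\Gamma_\mu \Gamma_\nu - \Gamma_\nu \Gamma_\mu)\, dx^\mu \wedge dx^\nu, \\
F &= \tfrac{1}{2}\, F_{\mu\nu}\, dx^\mu \wedge dx^\nu,
\qquad
F_{\mu\nu} = \partial_\mu \Gamma_\nu - \partial_\nu \Gamma_\mu + [\Gamma_\mu, \Gamma_\nu].
\end{align*}
% Each term changes sign under \mu <-> \nu, so F_{\mu\nu} = -F_{\nu\mu}.
% Abelian case (e.g. G = U(1)): the commutator vanishes, leaving
\begin{equation*}
F_{\mu\nu} = \partial_\mu A_\nu - \partial_\nu A_\mu .
\end{equation*}
```

In the abelian case this is the familiar expression for the electromagnetic field strength in terms of the potential, up to conventional factors of $i$ and the coupling constant.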


Theorem 24.4 (First Bianchi identity). Let $\Omega$ be the curvature two-form associated to a 
connection one-form $\omega$ on a principal bundle. Then 
\[
D\Omega = 0.
\]

Remark 24.5. Note that since $\Omega = D\omega$, Bianchi’s identity can be rewritten as $D^2\omega = 0$. 
However, unlike the exterior derivative $d$, the covariant exterior derivative does not satisfy 
$D^2 = 0$ in general. 
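
The Bianchi identity above is stated without proof. For a matrix Lie group, where the curvature can be written with the ordinary wedge product of matrix-valued forms as $\Omega = d\omega + \omega \wedge \omega$, a short sketch of the argument follows directly from the definition of $D$. The matrix-valued wedge notation is an assumption of this sketch.

```latex
\begin{align*}
d\Omega &= d(d\omega) + d(\omega \wedge \omega) \\
        &= d\omega \wedge \omega - \omega \wedge d\omega, \\
D\Omega(X,Y,Z) &= d\Omega(\mathrm{hor}(X), \mathrm{hor}(Y), \mathrm{hor}(Z)) \\
        &= (d\omega \wedge \omega - \omega \wedge d\omega)(\mathrm{hor}(X), \mathrm{hor}(Y), \mathrm{hor}(Z)) \\
        &= 0 .
\end{align*}
% Every term of the last expression contains a factor \omega evaluated on a
% horizontal vector field, and \omega vanishes on horizontal vector fields.
```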

24.2 Torsion 


Definition. Let $(P,\pi,M)$ be a principal $G$-bundle and let $V$ be the representation space 
of a linear $(\dim M)$-dimensional representation of the Lie group $G$. A solder(ing) form on 
$P$ is a one-form $\theta \in \Omega^1(P) \otimes V$ such that 

(i) $\forall\, X \in \Gamma(TP) : \theta(\mathrm{ver}(X)) = 0$; 
(ii) $\forall\, g \in G : g \triangleright ((\triangleleft\, g)^*\theta) = \theta$; 
(iii) $TM$ and $P_V$ are isomorphic as associated bundles. 

A solder form provides an identification of $V$ with each tangent space of $M$. 


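Example 24.6. Consider the frame bundle $(LM,\pi,M)$ and define 
\[
\begin{aligned}
\theta \colon \Gamma(T(LM)) &\to \mathbb{R}^{\dim M} \\
X &\mapsto (u^{-1}_{\pi(X)} \circ \pi_*)(X)
\end{aligned}
\]
where for each $e := (e_1,\ldots,e_{\dim M}) \in LM$, $u_e$ is defined as 
\[
\begin{aligned}
u_e \colon \mathbb{R}^{\dim M} &\xrightarrow{\sim} T_{\pi(e)}M \\
(x^1,\ldots,x^{\dim M}) &\mapsto x^i e_i.
\end{aligned}
\]
To describe the inverse map $u_e^{-1}$ explicitly, note that to every frame $(e_1,\ldots,e_{\dim M}) \in LM$, 
there exists a co-frame $(f^1,\ldots,f^{\dim M}) \in L^*M$ such that 
\[
\begin{aligned}
u_e^{-1} \colon T_{\pi(e)}M &\xrightarrow{\sim} \mathbb{R}^{\dim M} \\
Z &\mapsto (f^1(Z),\ldots,f^{\dim M}(Z)).
\end{aligned}
\]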


Definition. Let $(P,\pi,M)$ be a principal $G$-bundle with connection one-form $\omega$ and let 
$\theta \in \Omega^1(P) \otimes V$ be a solder form on $P$. Then 
\[
\Theta := D\theta \in \Omega^2(P) \otimes V
\]
is the torsion of $\omega$ with respect to $\theta$. 


Remark 24.7. You can now see that the “extra structure” required to define the torsion is 
a choice of solder form. The previous example shows that there is a canonical choice of such 
a form on any frame bundle. 


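We would like to have a similar formula for $\Theta$ as we had for $\Omega$. However, since $\theta$ and 
$\Theta$ are both $V$-valued but $\omega$ is $T_eG$-valued, the term $\omega ⩓ \theta$ would be meaningless. What we 
have, instead, is the following 
\[
\Theta = d\theta + \omega \barwedge \theta,
\]
where the half-double wedge symbol $\barwedge$ intuitively indicates that we let $\omega$ act on $\theta$. More 
precisely, in the case of a matrix Lie group whose elements, and hence those of $T_eG$, are 
$(\dim V \times \dim V)$ matrices acting on $V$, we have 
\[
\Theta^i = d\theta^i + \omega^i{}_k \wedge \theta^k.
\]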

Theorem 24.8 (Second Bianchi identity). Let $\Theta$ be the torsion of a connection one-form 
$\omega$ with respect to a solder form $\theta$ on a principal bundle. Then 
\[
D\Theta = \Omega \barwedge \theta.
\]
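
A sketch of why this identity holds in the matrix case, where its right hand side reads $\Omega^i{}_k \wedge \theta^k$ in components, starts from the component formula $\Theta^i = d\theta^i + \omega^i{}_k \wedge \theta^k$ and the definition of $D$. It uses only that $\omega$ vanishes on horizontal vectors, that $\theta$ vanishes on vertical vectors, and that $d\omega(\mathrm{hor}(X),\mathrm{hor}(Y)) = \Omega(X,Y)$. The abbreviation $hX := \mathrm{hor}(X)$ is introduced here for brevity and is not notation from the text.

```latex
\begin{align*}
d\Theta^i &= d(d\theta^i) + d(\omega^i{}_k \wedge \theta^k)
           = d\omega^i{}_k \wedge \theta^k - \omega^i{}_k \wedge d\theta^k, \\
D\Theta^i(X,Y,Z) &= d\Theta^i(hX, hY, hZ) \\
  &= (d\omega^i{}_k \wedge \theta^k)(hX, hY, hZ)
     && \text{since } \omega(hX) = 0 \\
  &= (\Omega^i{}_k \wedge \theta^k)(X, Y, Z)
     && \text{since } d\omega(hX,hY) = \Omega(X,Y),\ \theta(hX) = \theta(X).
\end{align*}
```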


Remark 24.9. Like connection one-forms and curvature two-forms, a torsion two-form $\Theta$ 
can also be pulled back to the base manifold along a local section $\sigma$ as $T := \sigma^*\Theta$. In fact, 
this is the torsion that one typically meets in general relativity. 

25 Covariant derivatives 


Recall that if $F$ is a vector space and $(P,\pi,M)$ a principal $G$-bundle equipped with a 
connection, we can use the parallel transport on the associated bundle $(P_F,\pi_F,M)$ and the 
vector space structure of $F$ to define the differential quotient of a local section $\sigma \colon U \to P_F$ 
along an integral curve of some tangent vector $X \in TU$. This then allowed us to define the 
covariant derivative of $\sigma$ at the point $\pi(X) \in U$ in the direction of $X \in TU$. 

This approach to the concept of covariant derivative is very intuitive and geometric, 
but it is a disaster from a technical point of view as it is quite difficult to implement. There 
is, in fact, a neater approach to covariant differentiation, which we will now discuss. 


25.1 Equivalence of local sections and equivariant functions 


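Theorem 25.1. Let $(P,\pi,M)$ be a principal $G$-bundle and $(P_F,\pi_F,M)$ be an associated 
bundle. Let $(U,x)$ be a chart on $M$. The local sections $\sigma \colon U \to P_F$ are in bijective 
correspondence with $G$-equivariant functions $\phi \colon \mathrm{preim}_\pi(U) \subseteq P \to F$, where the $G$-equivariance 
condition is 
\[
\forall\, g \in G : \forall\, p \in \mathrm{preim}_\pi(U) : \quad \phi(p \triangleleft g) = g^{-1} \triangleright \phi(p).
\]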
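Proof. (a) Let $\phi \colon \mathrm{preim}_\pi(U) \to F$ be $G$-equivariant. Define 
\[
\begin{aligned}
\sigma_\phi \colon U &\to P_F \\
m &\mapsto [p, \phi(p)]
\end{aligned}
\]
where $p$ is any point in $\mathrm{preim}_\pi(\{m\})$. First, we should check that $\sigma_\phi$ is well-defined. 
Let $p,\tilde{p} \in \mathrm{preim}_\pi(\{m\})$. Then, there exists a unique $g \in G$ such that $\tilde{p} = p \triangleleft g$. 
Then, by the $G$-equivariance of $\phi$, we have 
\[
[\tilde{p}, \phi(\tilde{p})] = [p \triangleleft g, \phi(p \triangleleft g)] = [p \triangleleft g, g^{-1} \triangleright \phi(p)] = [p, \phi(p)]
\]
and hence, $\sigma_\phi$ is well-defined. Moreover, since for all $g \in G$ 
\[
\pi_F([p, \phi(p)]) = \pi(p) = \pi(p \triangleleft g) = \pi_F([p \triangleleft g, g^{-1} \triangleright \phi(p)]),
\]
we have $\pi_F \circ \sigma_\phi = \mathrm{id}_U$ and thus, $\sigma_\phi$ is a local section. 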
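(b) Let $\sigma \colon U \to P_F$ be a local section. Define 
\[
\begin{aligned}
\phi_\sigma \colon \mathrm{preim}_\pi(U) &\to F \\
p &\mapsto i_p^{-1}(\sigma(\pi(p)))
\end{aligned}
\]
where $i_p^{-1}$ is the inverse of the map 
\[
\begin{aligned}
i_p \colon F &\to \mathrm{preim}_{\pi_F}(\{\pi(p)\}) \subseteq P_F \\
f &\mapsto [p, f].
\end{aligned}
\]
Observe that, for all $g \in G$, we have 
\[
i_p(f) = [p, f] = [p \triangleleft g, g^{-1} \triangleright f] =: i_{p \triangleleft g}(g^{-1} \triangleright f).
\]
Let us now show that $\phi_\sigma$ is $G$-equivariant. We have 
\[
\begin{aligned}
\phi_\sigma(p \triangleleft g) &= i^{-1}_{p \triangleleft g}(\sigma(\pi(p \triangleleft g))) \\
&= i^{-1}_{p \triangleleft g}(\sigma(\pi(p))) \\
&= i^{-1}_{p \triangleleft g}(i_p(\phi_\sigma(p))) \\
&= i^{-1}_{p \triangleleft g}(i_{p \triangleleft g}(g^{-1} \triangleright \phi_\sigma(p))) \\
&= g^{-1} \triangleright \phi_\sigma(p),
\end{aligned}
\]
which is what we wanted. 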
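(c) We now show that these constructions are the inverses of each other, i.e. 
\[
\sigma_{\phi_\sigma} = \sigma, \qquad \phi_{\sigma_\phi} = \phi.
\]
Let $m \in U$ and let $p \in \mathrm{preim}_\pi(\{m\})$. Then, we have 
\[
\sigma_{\phi_\sigma}(m) = [p, \phi_\sigma(p)] = i_p(\phi_\sigma(p)) = i_p(i_p^{-1}(\sigma(\pi(p)))) = \sigma(m)
\]
and hence, $\sigma_{\phi_\sigma} = \sigma$. Now let $p \in \mathrm{preim}_\pi(U)$. Then, we have 
\[
\phi_{\sigma_\phi}(p) = i_p^{-1}(\sigma_\phi(\pi(p))) = i_p^{-1}([p, \phi(p)]) = i_p^{-1}(i_p(\phi(p))) = \phi(p)
\]
and hence, $\phi_{\sigma_\phi} = \phi$. 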
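
To make the correspondence concrete, consider the simplest case of a trivial bundle. The following sketch assumes $P = M \times G$ with right action $(m,h) \triangleleft g := (m,hg)$ and an arbitrary function $f \colon U \to F$; these choices are assumptions made for illustration and are not part of the theorem.

```latex
\begin{align*}
\phi(m,h) &:= h^{-1} \triangleright f(m), \\
\phi((m,h) \triangleleft g) &= \phi(m,hg)
   = (hg)^{-1} \triangleright f(m)
   = g^{-1} \triangleright \bigl(h^{-1} \triangleright f(m)\bigr)
   = g^{-1} \triangleright \phi(m,h), \\
\sigma_\phi(m) &= [(m,h),\, \phi(m,h)]
   = [(m,e) \triangleleft h,\; h^{-1} \triangleright f(m)]
   = [(m,e),\, f(m)] .
\end{align*}
```

So on a trivial bundle an equivariant function is fixed by its values $f(m) = \phi(m,e)$ on the identity section, which is exactly the data of a local section of the associated bundle.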


25.2 Linear actions on associated vector fibre bundles 


We now specialise to the case where $F$ is a vector space, and hence we can require the left 
action $g \triangleright \colon F \to F$ of each $g \in G$ to be linear. 


Proposition 25.2. Let $(P,\pi,M)$ be a principal $G$-bundle, and let $(P_F,\pi_F,M)$ be an 
associated bundle, where $G$ is a matrix Lie group, $F$ is a vector space, and the left $G$-action on 
$F$ is linear. Let $\phi \colon P \to F$ be $G$-equivariant. Then 
\[
\phi(p \triangleleft \exp(At)) = \exp(-At) \triangleright \phi(p),
\]
where $p \in P$ and $A \in T_eG$. 


Corollary 25.3. With the same assumptions as above, let $A \in T_eG$ and let $\omega$ be a 
connection one-form on $(P,\pi,M)$. Then 
\[
d\phi(X^A) + \omega(X^A) \triangleright \phi = 0.
\]
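
Proof. Since $\phi$ is $G$-equivariant, by applying the previous proposition, we have 
\[
\phi(p \triangleleft \exp(At)) = \exp(-At) \triangleright \phi(p)
\]
for any $p \in P$. Hence, differentiating with respect to $t$ at $t = 0$ and using $\omega(X^A) = A$ in the last step yields 
\[
\begin{aligned}
(\phi(p \triangleleft \exp(At)))'(0) &= (\exp(-At) \triangleright \phi(p))'(0) \\
d_p\phi(X^A) &= -A \triangleright \phi(p) \\
d_p\phi(X^A) &= -\omega(X^A) \triangleright \phi(p)
\end{aligned}
\]
for all $p \in P$ and hence, the claim holds. 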
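
As an illustration of the corollary, take $G = U(1)$ acting on $F = \mathbb{C}$ by $g \triangleright z := g^n z$ for a fixed integer $n$. This representation and the parametrisation $A = i\alpha$ with $\alpha \in \mathbb{R}$ are assumptions chosen for the example.

```latex
\begin{align*}
\exp(At) &= e^{i\alpha t}, \qquad
A \triangleright z := \tfrac{d}{dt}\Big|_{t=0} \bigl(\exp(At) \triangleright z\bigr) = i n \alpha\, z, \\
\phi(p \triangleleft e^{i\alpha t}) &= e^{-i n \alpha t}\, \phi(p), \\
d_p\phi(X^A) &= \tfrac{d}{dt}\Big|_{t=0} \phi(p \triangleleft e^{i\alpha t})
   = -\,i n \alpha\, \phi(p)
   = -\,\omega(X^A) \triangleright \phi(p).
\end{align*}
```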


25.3 Construction of the covariant derivative 


We now wish to construct a covariant derivative, i.e. an “operator” $\nabla$ such that for any 
local section $\sigma \colon U \subseteq M \to P_F$ and any $X \in T_mU$ with $m \in U$, we have that $\nabla_X\sigma$ is again 
a local section $U \to P_F$ and 

i) $\nabla_{fX+Y}\sigma = f\nabla_X\sigma + \nabla_Y\sigma$ 
ii) $\nabla_X(\sigma+\tau) = \nabla_X\sigma + \nabla_X\tau$ 
iii) $\nabla_X f\sigma = X(f)\sigma + f\nabla_X\sigma$ 

for any sections $\sigma,\tau \colon U \to P_F$, any $f \in C^\infty(U)$, and any $X,Y \in T_mU$. 

These (together with $\nabla_X f := X(f)$) are usually presented as the defining properties 
of the covariant derivative in more elementary treatments. 

Recall that functions are a special case of forms, namely the 0-forms, and hence the 
exterior covariant derivative of a function $\phi \colon P \to F$ is 
\[
D\phi := d\phi \circ \mathrm{hor}.
\]
We now have the following result. 
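Proposition 25.4. Let $\phi \colon P \to F$ be $G$-equivariant and let $X \in T_pP$. Then 
\[
D\phi(X) = d\phi(X) + \omega(X) \triangleright \phi.
\]
Proof. (a) Suppose that $X$ is vertical, that is, $X = X^A$ for some $A \in T_eG$. Then, 
\[
D\phi(X) = d\phi(\mathrm{hor}(X)) = 0
\]
and 
\[
d\phi(X^A) + \omega(X^A) \triangleright \phi = 0
\]
by the previous corollary. 
(b) Suppose that $X$ is horizontal. Then, 
\[
D\phi(X) = d\phi(X)
\]
and $\omega(X) = 0$, so that we have 
\[
D\phi(X) = d\phi(X) + \omega(X) \triangleright \phi.
\]
The general case follows by linearity, since every $X$ decomposes as $\mathrm{hor}(X) + \mathrm{ver}(X)$. 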

Hence, it is clear from this proposition that $D\phi(X)$, which we can also write as $D_X\phi$, is 
$C^\infty(P)$-linear in the $X$-slot, additive in the $\phi$-slot and satisfies property iii) above. However, 
it is also clearly not a covariant derivative since $X \in TP$ rather than $X \in TM$ and $\phi$ is a 
$G$-equivariant function $P \to F$ rather than a local section of $(P_F,\pi_F,M)$. 

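We can obtain a covariant derivative from $D$ by introducing a local trivialisation on 
the bundle $(P,\pi,M)$. Indeed, let $s \colon U \subseteq M \to P$ be a local section. Then, we can pull 
back the following objects 

\[
\begin{array}{lcl}
\phi \colon P \to F & \rightsquigarrow & s^*\phi := \phi \circ s \colon U \to F \\
\omega \in \Omega^1(P) \otimes T_eG & \rightsquigarrow & \omega^U := s^*\omega \in \Omega^1(U) \otimes T_eG \\
D\phi \in \Omega^1(P) \otimes F & \rightsquigarrow & s^*(D\phi) \in \Omega^1(U) \otimes F.
\end{array}
\]

Here $s^*\phi$ is the $F$-valued function representing the local section $m \mapsto [s(m), \phi(s(m))]$ of $P_F$. 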


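It is, in fact, for this last object that we will be able to define the covariant derivative. 
Let $X \in TU$. Then 

\[
\begin{aligned}
(s^*D\phi)(X) &= s^*(d\phi + \omega \triangleright \phi)(X) \\
&= s^*(d\phi)(X) + s^*(\omega \triangleright \phi)(X) \\
&= d(s^*\phi)(X) + s^*(\omega)(X) \triangleright s^*\phi \\
&= d\sigma(X) + \omega^U(X) \triangleright \sigma
\end{aligned}
\]

where we renamed $s^*\phi =: \sigma$. In summary, we can write 
\[
\nabla_X\sigma = d\sigma(X) + \omega^U(X) \triangleright \sigma.
\]

One can check that this satisfies all the properties that we wanted a covariant derivative to 
satisfy. Of course, we should note that this is a local definition. 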


Remark 25.5. Observe that the definition of covariant derivative depends on two choices 
which can be made quite independently of each other, namely, the choice of connection 
one-form $\omega$ (which determines $\omega^U$) and the choice of linear left action $\triangleright$ on $F$. 
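
To illustrate the dependence on the action, the following sketch assumes $G = GL(d,\mathbb{R})$, a chart with coordinates $x^\mu$ on $U$, the shorthand $\nabla_\mu := \nabla_{\partial_\mu}$, and the notation $\Gamma^i{}_{j\mu}$ for the matrix components of $\omega^U(\partial_\mu)$. It compares the defining action on column vectors, $F = \mathbb{R}^d$, with the dual action on row vectors, $F = (\mathbb{R}^d)^*$. Both choices of action are assumptions made for the example.

```latex
\begin{align*}
&F = \mathbb{R}^d, \quad g \triangleright v := g\,v:
  && (A \triangleright v)^i = A^i{}_j\, v^j,
  && (\nabla_\mu \sigma)^i = \partial_\mu \sigma^i + \Gamma^i{}_{j\mu}\, \sigma^j, \\
&F = (\mathbb{R}^d)^*, \quad g \triangleright \alpha := \alpha\, g^{-1}:
  && (A \triangleright \alpha)_j = -\,\alpha_i\, A^i{}_j,
  && (\nabla_\mu \sigma)_j = \partial_\mu \sigma_j - \Gamma^i{}_{j\mu}\, \sigma_i .
\end{align*}
```

The same connection thus produces the familiar opposite signs for vector and covector components.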